Module talk:la-pronunc/Archive 2014–2020


Double-l

On pollinem, it has output dark l followed by light l. That doesn't seem right, as the outcome in later languages suggests the pronunciation was a geminate light l in that case. Should this be fixed? —CodeCat 17:45, 18 June 2014 (UTC)Reply

@CodeCat Well, I just followed the ninth point in the notes in w:Latin spelling and pronunciation#Consonants. --kc_kennylau (talk) 17:53, 18 June 2014 (UTC)
But it says: "According to Andrew Sihler, comparative evidence indicates that, when after a vowel, el exīlis [l] occurred before an /i/ or another /l/, while el pinguis [ɫ] occurred in all other circumstances." So it seems that the realisation of /ll/ was [lː], not [ɫː], and [lɫ] or [ɫl] seem very improbable in any language. —CodeCat 17:57, 18 June 2014 (UTC)Reply
@CodeCat Changed accordingly. --kc_kennylau (talk) 18:08, 18 June 2014 (UTC)
I just noticed patella has a similar problem but in reverse. It should be geminate [ll] (light) here also, as confirmed by all the descendants which have a palatalised l here. —CodeCat 23:04, 31 July 2014 (UTC)Reply
Just to note that in Catalan descendants, e.g. pol·len from Latin pollen, it is pronounced [ɫː] or [ɫɫ]. I think Catalan and Portuguese are the only Romance languages with pinguis and Catalan and Italian are the only ones with double L not palatalised, so Catalan descendants with "l·l" are a good point of comparison, although not conclusive about original Latin. --Vriullop (talk) 09:09, 1 August 2014 (UTC)
l·l occurs in Catalan exclusively in loanwords though, precisely because inherited double l became palatal ll. —CodeCat 10:24, 1 August 2014 (UTC)Reply
Most l·l are modern loanwords, but not exclusively: col·legi dates at least from the 14th century. In Balearic dialects, which are closer to Old Catalan, the sound [ɫɫ] is usual, e.g. al·lot or diminutives in -el·lo like Italian -ello. Anyway, I am not sure about Classical Latin. --Vriullop (talk) 11:45, 1 August 2014 (UTC)
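
For illustration only, the rule quoted from Sihler in this thread (light [l] before /i/ or in a geminate /ll/, dark [ɫ] otherwise after a vowel) can be sketched as below. This is not the module's code: the function name `l_allophones`, the vowel inventory, and the choice to leave an l that does not follow a vowel unmarked are all assumptions made for the sketch.

```python
# Hypothetical sketch of the el exilis / el pinguis rule as quoted above.
# Input is assumed to be a lowercase spelling, optionally with macrons.
VOWELS = set("aeiouyāēīōūȳ")


def l_allophones(word: str) -> str:
    """Mark dark l as 'ɫ'; light l stays 'l'."""
    out = []
    for i, ch in enumerate(word):
        if ch != "l":
            out.append(ch)
            continue
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        in_geminate = prev == "l" or nxt == "l"
        if in_geminate or nxt in ("i", "ī"):
            out.append("l")  # light: part of /ll/ or before /i/
        elif prev in VOWELS:
            out.append("ɫ")  # dark: after a vowel in every other context
        else:
            out.append("l")  # not covered by the quoted rule; left unmarked
    return "".join(out)
```

Under these assumptions, pollinem and patella keep a light geminate, while alter comes out with a dark l.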

Leibnitius

It seems that the module is doing the syllable division wrong here. Is that something that can be fixed? —CodeCat 00:55, 2 July 2014 (UTC)

Fixed. --kc_kennylau (talk) 13:29, 2 July 2014 (UTC)

Intervocalic consonantal i and v

It seems that the module doesn't recognize that intervocalic consonantal i is usually geminate: inputting maior yields (Classical) IPA(key): /ˈmai̯.i̯or/, [ˈmäi̯ːɔr]

Intervocalic consonantal u shows sort of the opposite behavior: usually single, except in Greek loans, such as evangelium, because Ancient Greek had doubled intervocalic semivowels. (I think that word has an inaccurate transcription: the v, not the e, should be long. The macron on the e is an attempt to represent the fact that the first syllable is heavy, like the macron sometimes written on the first vowel of cuius or maior.) — Eru·tuon 03:49, 18 September 2016 (UTC)Reply

@Erutuon: Do you mean that the macron indicates syllabic stress, overriding the Latin stress rules? --kc_kennylau (talk) 10:29, 3 November 2016 (UTC)
I think it's simply that the macron is indicating an increase in syllable weight, which almost always means the vowel is long, but sometimes instead means the following consonant is doubled (does this happen only with semivowels?). --WikiTiki89 13:16, 3 November 2016 (UTC)
@Erutuon: I do not intend on fixing the first one because there are too many exceptions. Please use {{la-IPA|majjor}} to produce (Classical) IPA(key): /ˈmai̯.i̯or/, [ˈmäi̯ːɔr]
@Kc kennylau: I suppose it kind of makes sense to feed the phonetic spelling of the word into the IPA template. I was thinking that perhaps the few cases without doubled /jj/ could be given using a hyphen (for instance, trā-jectus); then any cases without hyphen would have doubled jj. But I am not sure if manually inputting the lack of gemination is more or less work than manually inputting gemination.
I'm not sure what to do with evangelium. If I input evvangelium, the Classical IPA is right, but Ecclesiastical is wrong (see this revision) – /eu̯.vanˈd͡ʒe.li.um/. Is there a way to tell the module to only display Ecclesiastical, so I can input one spelling for Classical and another for Ecclesiastical? — Eru·tuon 20:45, 3 November 2016 (UTC)Reply
@Erutuon: There are actually not just a "few cases"; there are many cases without the geminated consonant.
Do you mean that it should be pronounced IPA(key): /eu̯.wanˈɡe.li.um/ in Classical Latin but IPA(key): /e.vanˈd͡ʒe.li.um/ in Ecclesiastical Latin? --kc_kennylau (talk)
@Kc kennylau: I haven't gone through and done a census of the exceptions. I suppose you are right, since there are quite a few verbs beginning in /j/ with prepositional prefixes added. Is that what you're referring to?
Yes, those two transcriptions of evangelium are more or less accurate. W. Sidney Allen said that the Ancient Greek doubled semivowel, which appears when a diphthong is followed by a vowel, was borrowed into Latin. I say more or less accurate because the Classical Latin could also be transcribed IPA(key): /ew.wanˈɡe.li.um/, since a non-syllabic /u̯/ is roughly equivalent to a semivowel /w/. I am not sure how to decide between the two transcriptions. {{grc-IPA}} doesn't currently transcribe this feature: see the first IPA transcription at εὐαγγέλιον (euangélion). — Eru·tuon 18:16, 4 November 2016 (UTC)Reply
@Erutuon: Why is the "i" not geminated in Cāiēta? --kc_kennylau (talk) 14:06, 5 November 2016 (UTC)
@Kc kennylau: The transcription is puzzling. I think it should be IPA(key): /kajˈjeː.ta/, or at the very least that the i should be a consonant – IPA(key): /kaːˈjeː.ta/ – since the Greek form has a diphthong. — Eru·tuon 17:32, 5 November 2016 (UTC)
@I'm so meta even this acronym: Why did you put {{la-IPA|Cāi.ēta}} for Cāiēta? --kc_kennylau (talk) 00:45, 6 November 2016 (UTC)
@kc_kennylau: I inferred from L&S and Gaffiot both lemmatising the spelling “Cāiēta” that the i is a vowel, since they would lemmatise *“Cājēta” if it were a consonant. — I.S.M.E.T.A. 01:02, 6 November 2016 (UTC)
@Erutuon: ISMETA does have a point. --kc_kennylau (talk) 01:23, 6 November 2016 (UTC)
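
For reference, the hyphen convention floated earlier in this thread (double an intervocalic j by default, and write a hyphen before it to suppress the doubling, as in trā-jectus) could look roughly like the sketch below. It was only a suggestion and was not adopted here; the function name `expand_j` and the assumption that consonantal i is already spelled j in the input are hypothetical.

```python
import re

# Hypothetical sketch: default gemination of intervocalic j, with a hyphen
# before the j acting as an opt-out marker that is removed afterwards.
VOWELS = "aeiouyāēīōūȳ"


def expand_j(respelling: str) -> str:
    # A j directly between two vowels is doubled; a j preceded by a hyphen
    # does not match the lookbehind and is therefore left single.
    doubled = re.sub(rf"(?<=[{VOWELS}])j(?=[{VOWELS}])", "jj", respelling)
    return doubled.replace("-", "")
```

With this sketch, `expand_j("major")` gives the same string that the explicit input {{la-IPA|majjor}} supplies by hand, while `expand_j("trā-jectus")` keeps a single j.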

Aeneis VI.900:
Tum se ad |Caie|tae rec|to fert |limite |portum.
Aeneis VII.2:
aeter|nam mori|ens fa|mam, Cai|eta, de|disti;

@Kc kennylau, I'm so meta even this acronym: That's a valid inference, but I was curious to see what the scansion would tell us. It's used in two lines of the Aeneid, which can be seen above. It seems ai can either be a diphthong in both cases (a͡i), or else in the first case aie is a sequence of a long vowel and two short vowels (āĭĕ) and in the second case ai is a sequence of two short vowels (ăĭ). The more parsimonious analysis would be that ai is a diphthong. Not sure why L&S don't write the word as Cājēta. — Eru·tuon 01:46, 6 November 2016 (UTC)Reply
@Erutuon: The more parsimonious analysis would be that "ai" can be both a diphthong and two monophthongs in poetry. A reason that L&S doesn't write the word as Cājēta might be that they want to preserve the diphthong "āi". --kc_kennylau (talk) 01:57, 6 November 2016 (UTC)Reply
@Kc kennylau: Actually, I would have assumed that by writing Caieta as Cāiēta, L&S would be indicating a non-diphthongal pronunciation Cāĭētă. Cājēta would indicate the diphthongal pronunciation Cajjētă (analogous to mājus = majjus). By diphthongal, I mean having a short vowel followed by a semivowel.
Cajjēt- makes the most sense because the same analysis can be used in both of the lines from the Aeneid. The alternative analysis Cāĭĕt- in the first of the two lines quoted above would contradict the vowel length in the Ancient Greek etymon Καιήτη (Kaiḗtē), and the alternative analysis Căĭē- in the second of the two lines is unnecessary, because the diphthong /aj/ (Caj(jēt)-) works just fine in the meter (unless there's some reason why the foot that the syllable is in has to be a dactyl rather than a spondee). In addition, the analysis Cajjēt- agrees with the Greek, which has the diphthong αι (ai), not the disyllabic sequence ᾱῐ (āi). — Eru·tuon 03:14, 6 November 2016 (UTC)Reply

────────────────────────────────────────────────────────────────────────────────────────────────────

@Erutuon: How is iēiūnus pronounced?

I think it's probably got the doubled semivowel: jejjūnus. The intervocalic j came from PIE *ǵy, and the fact that it comes from a sequence of two consonants suggests it's double. — Eru·tuon 03:32, 6 November 2016 (UTC)Reply
@Erutuon: How is -ēius pronounced? --kc_kennylau (talk) 05:22, 6 November 2016 (UTC)
@Erutuon: How is stoicheiologia pronounced? --kc_kennylau (talk) 05:35, 6 November 2016 (UTC)
@Kc kennylau: I think stoicheiologia would have -ejj-. Not sure about -ēius. It looks like it has three different etymologies, and each one might be different. The one from Greek is probably -ēĭŭs. — Eru·tuon 20:51, 6 November 2016 (UTC)
@Erutuon: I'm only interested in the first etymology. The other two probably are -ēĭŭs. --kc_kennylau (talk) 10:19, 7 November 2016 (UTC)
@Erutuon: I think this settles it. --kc_kennylau (talk) 10:43, 7 November 2016 (UTC)
@Erutuon: But this is trisyllabic... --kc_kennylau (talk) 10:48, 7 November 2016 (UTC)
@Erutuon: And this is polymorphic... --kc_kennylau (talk) 10:55, 7 November 2016 (UTC)
@Erutuon: I think I agree with you re Ca͡iēta. Could you explain your parenthetical analysis, viz. “unless there's some reason why the foot that the syllable is in has to be a dactyl rather than a spondee”, please? @kc_kennylau: It should be sto͡iche͡iologia; I'm not sure how that affects the IPA, but it definitely shouldn't begin /sto.i.kʰ/… Maybe /stoj.kʰ/ or /stoi̯.kʰ/? — I.S.M.E.T.A. 07:33, 8 November 2016 (UTC)Reply
@kc_kennylau: The OLD doesn't have an entry corresponding to L&S's Ĕlătēïus. It does have entries for plēbēius (plebs +‎ -ius), Pompē(i)ī [sic], and Pompēius, but unfortunately it makes no remarks concerning numbers of syllables or diphthongisation. — I.S.M.E.T.A. 07:56, 8 November 2016 (UTC)Reply
@I'm so meta even this acronym: Latin does not have the diphthong "oi". I have changed it to "oe". --kc_kennylau (talk) 12:57, 8 November 2016 (UTC)Reply
@kc_kennylau: That's probably fine. Stoicheiologia is just an unnaturalised transcription of στοιχειολογία (stoikheiología), really. — I.S.M.E.T.A. 20:12, 8 November 2016 (UTC)Reply
@I'm so meta even this acronym: It was something I read about dactylic hexameter, maybe relating to Homer. I looked at my book of the first 12 books of the Odyssey, and it says that spondees are less common in the third and fifth feet. I was thinking that some generalization like that might be true for Latin as well. But since it's just a generalization and not a rule, it's probably not relevant. — Eru·tuon 20:23, 8 November 2016 (UTC)Reply
@Erutuon: OK. I need to teach myself scansion. My understanding of it is most limited. — I.S.M.E.T.A. 21:45, 8 November 2016 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Erutuon, kc_kennylau: Re parasceūē, do y'all think Gaffiot meant to signify “părasce͡uē”? — I.S.M.E.T.A. 22:29, 8 November 2016 (UTC)Reply

This anon. seems to think so. — I.S.M.E.T.A. 17:16, 6 December 2016 (UTC)

aspiration

@CodeCat, Metaknowledge, Erutuon, Kc kennylau, JohnC5: I noticed this module transcribes th, ph, ch, and gh with aspiration ((Classical) IPA(key): /a.tʰa.pʰaˈkʰaɡ.ha/, [ät̪ʰäpʰäˈkʰäɡ(ɦ)ä]

I was taught that it was indeed aspirated, at least at the time when words were being borrowed from Greek (very few native words have this). I don't know what gh is doing at all. —Μετάknowledgediscuss/deeds 17:53, 2 November 2016 (UTC)Reply
@CodeCat, Metaknowledge, Erutuon, Wikitiki89, JohnC5: I cannot find any word beginning in gh except the (obviously) New Latin word ghanensis. I do not reject that gh might occur in the middle of a word, but I have no idea how to search for that. --kc_kennylau (talk) 10:30, 3 November 2016 (UTC)Reply
@Kc kennylau: Perseus contains no mentions of gh. Wikiling has 101 results containing the sequence, but to my eye, they all appear to be Medieval Latin and thus irrelevant to this discussion. —JohnC5 14:42, 3 November 2016 (UTC)Reply
What does the actual research say? People who teach languages are not always experts on the finer linguistic details of the language. From my perspective, most speakers of languages that do not have phonemic aspiration do not perceive aspiration and it is therefore unlikely that they would have borrowed phonemic aspiration from another language. --WikiTiki89 13:20, 3 November 2016 (UTC)Reply
Generally, people who teach languages know little about language itself — but it's different in the classics. Here's Wikipedia: "The aspirated consonants /pʰ tʰ kʰ/ as distinctive phonemes were originally foreign to Latin, appearing in educated loanwords and names from Greek. In such cases, the aspiration was likely produced only by educated speakers." I like to bring up Catullus 84, which is a poem making fun of a lower-class man who tries to seem more educated by aspirating more, but fails utterly because most of his aspirations are just hypercorrections. —Μετάknowledgediscuss/deeds 17:01, 3 November 2016 (UTC)Reply
Ok, so then shouldn't they be marked with optional aspiration or something like that? I feel like this is roughly equivalent to how English-speaking francophiles have different pronunciations of some words borrowed from French than the average English speaker (for these words, we normally give both pronunciations separately). --WikiTiki89 17:22, 3 November 2016 (UTC)Reply

circum- assimilation

@CodeCat, Metaknowledge, Erutuon, Kc kennylau, Wikitiki89: This anon has recently been converting circum into circun in the {{la-IPA}} call before non-labial consonants (e.g.). Is there any evidence of this assimilation? I just wanted to make sure before the editor changes everything. The person has also been making some other interesting changes that should be examined more carefully. —JohnC5 22:18, 27 November 2016 (UTC)

g and ɡ

@JohnC5 There's an incorrect "g" slipping through on ghanensis. Do you know how to fix that? I don't want to break anything. DTLHS (talk) 01:17, 22 February 2017 (UTC)

@DTLHS: I believe this was the culprit! Please tell me if there are any other issues. —JohnC5 01:33, 22 February 2017 (UTC)
I believe it was decided at some point that the regular g was actually preferred over the "IPA" ɡ. Who changed the IPA module to complain about the regular g? --WikiTiki89 02:26, 22 February 2017 (UTC)Reply
I am pretty sure that was not decided, considering that it would go against our standard practice, and would really need to be decided by a vote, which most certainly has not occurred. —Μετάknowledgediscuss/deeds 02:37, 22 February 2017 (UTC)Reply

Long vowels in phonetic Ecclesiastical Latin

At exhauriō I found this:

Shouldn't that long vowel mark not appear after a diphthong? If so, is there anyone who could fix this? Thanks. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:32, 17 June 2017 (UTC)Reply

@IvanScrooge98: That was definitely wrong. Fixed. — Eru·tuon 15:28, 17 June 2017 (UTC)

Intervocalic s in Ecclesiastical Latin

At ecclesia (and presumably elsewhere), the module renders the "s" as /s/ (and [s]) rather than /z/. Every pronunciation guide to Ecclesiastical Latin that I have read (which is a fair number) says that intervocalic s is always pronounced like a Z (one source said it is a "soft Z sound", by which they were presumably referring to initial devoicing of the z). Could someone fix this please? Andrew Sheedy (talk) 02:10, 3 May 2019 (UTC)

I agree with the proposed change and may try to implement it, though I'm curious why it wasn't done before. I think I've heard that intervocalic s used to be voiceless and now is pronounced voiced, though that might have been in Standard Italian rather than Ecclesiastical Latin. (Soft is actually an old word for voiced.) — Eru·tuon 04:25, 3 May 2019 (UTC)Reply
Thanks. Yeah, perhaps the pronunciation has changed over time, though I think the /z/ pronunciation is very well established by now. I wasn't aware that soft used to refer to voiced sounds, so I may have misinterpreted what that source was saying. I think there was some elaboration that led me to understand it the way I did, though. Andrew Sheedy (talk) 15:38, 3 May 2019 (UTC)Reply
As far as I know, this is one of the areas where there is no real consensus about the Ecclesiastical Latin pronunciation. The issue is that "Ecclesiastical Latin" is in origin basically Latin as pronounced by Italians, and different regions of Italy have different patterns of usage for [s] vs. [z]. When you look at guides about pronouncing Ecclesiastical Latin that are aimed at or written by English speakers, you may be dealing with English speakers' ideas about how Italians pronounce the letter S, which adds even more room for variation. I made a post here summarizing the conflicting sources that I have seen: "Is “s” between two vowels voiced or unvoiced?", Latin Language Stack Exchange. At least some sources say to use voiceless [s] or some value "between" English /s/ and /z/. --Urszag (talk) 00:52, 4 May 2019 (UTC)
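
Since the sources disagree, one way to picture the requested change is as a switchable rule rather than a fixed one. The sketch below is only an illustration under that assumption: the function name `ecclesiastical_s`, the flag `voice_intervocalic`, and the plain unsyllabified input string are all hypothetical and not taken from the module.

```python
import re

# Hypothetical sketch: optionally voice a single s between two vowels.
# Input is assumed to be a simple phonemic string without syllable dots.
VOWELS = "aeiou"


def ecclesiastical_s(phonemes: str, voice_intervocalic: bool = True) -> str:
    if not voice_intervocalic:
        return phonemes  # keep voiceless [s] everywhere
    # Only a lone s flanked by vowels is affected; geminate ss and
    # word-initial or preconsonantal s are left as they are.
    return re.sub(rf"(?<=[{VOWELS}])s(?=[{VOWELS}])", "z", phonemes)
```

Keeping the behaviour behind a flag leaves room for either of the conventions described in this thread.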

Stress in Ecclesiastical Latin for words with ae in the penult syllable followed by a vowel

I'm not totally certain, but I have the impression that there are no systematic differences in the position of stress between Ecclesiastical Latin and Classical Latin. However, the Latin IPA pronunciation module currently seems to make a difference with words like iudaeus and Matthaeus, showing penultimate stress for Classical Latin and antepenultimate stress for Ecclesiastical Latin. I'm not sure what exactly is causing this; perhaps it's simplifying ae to ɛ before applying the stress assignment rule. Unless the first sentence of this paragraph is wrong, I think it would be easier to avoid such errors if stress were always calculated based on the Classical Latin pronunciation.--Urszag (talk) 01:08, 4 May 2019 (UTC)Reply

Actually, I have found one reference that suggests a certain systematic difference in stress, but it isn't one that is currently applied by the template. In "On the Pronunciation of Latin", from The Irish Ecclesiastical Record Volume V, 1884, it is said that learned words ending in -ia from Greek were accented in that time on the penult. That is, Italian speakers pronounced Latin theologia like Italian teologia [teoloˈdʒia]. I don't know whether that continues to be true in the Ecclesiastical Latin of today.--Urszag (talk) 01:40, 4 May 2019 (UTC)Reply
Having the module translate ae to ɛː at the beginning of the process fixes the stress in the testcases at least. I don't entirely understand the module, so it's possible that this has bad side-effects somewhere. — Eru·tuon 03:07, 4 May 2019 (UTC)Reply
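
The ordering problem discussed in this section can be illustrated with a small sketch of the usual penultimate stress rule. It is not the module's code: the names `is_heavy` and `stress_index`, the hand-split syllable lists, and the simplified weight test are assumptions made for the example.

```python
# Hypothetical sketch of the Latin stress rule applied to pre-split syllables.
LONG_VOWELS = set("āēīōūȳ")
DIPHTHONGS = ("ae", "au", "oe")
VOWELS = set("aeiouyɛɔ") | LONG_VOWELS


def is_heavy(syllable: str) -> bool:
    """Heavy = long vowel (macron or length mark), diphthong, or closed."""
    if "ː" in syllable or any(ch in LONG_VOWELS for ch in syllable):
        return True
    if any(d in syllable for d in DIPHTHONGS):
        return True
    return syllable[-1] not in VOWELS  # ends in a consonant


def stress_index(syllables: list[str]) -> int:
    """Return the index of the stressed syllable."""
    n = len(syllables)
    if n <= 2:
        return 0
    # Penult if heavy, otherwise antepenult.
    return n - 2 if is_heavy(syllables[-2]) else n - 3
```

In this sketch, `["jū", "dae", "us"]` is stressed on the penult, `["jū", "de", "us"]` (ae reduced to a short e before stress assignment) wrongly moves the stress to the antepenult, and `["jū", "dɛː", "us"]` restores the penult, which matches the effect described above.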

Consonant cluster syllabification for prefixed words: should be sub.latus, ad.rogo rather than su.blatus and a.drogo

In Classical Latin, the final consonant of a prefix was "re-syllabified" into the onset of a following syllable before a vowel: e.g. sŭ.burbium "suburb". But resyllabification does not seem to have occurred before another consonant, even when the two consonants would have made a valid onset: words like sublatus, adrogo, obligo are apparently syllabified in poetry as sub.latus, ad.rogo, ob.ligo etc, with a heavy first syllable. A source that gives this pattern is Syllable and Segment in Latin, by Ranjan Sen: "Morphologically governed syllabification of TR is an unambiguous feature of Latin in the literary period. Scansion of early and Classical Latin verse clearly indicates that TR was heterosyllabic when found across a prefix + root boundary, as the preceding syllable was scanned long, hence was heavy due to closure (Allen 1973: 140)" (p. 93). Sen gives syllabifications splitting b from a following sonorant for ab.ripio, ab.rumpo, ob.lino and ob.ligo. As far as I can tell, a complete list of the prefixes where this makes a relevant difference to the weight of the first syllable is ab, ob, sub, and ad. It seems like it wouldn't be ideal to have to put these syllabifications in manually for each word with one of these prefixes, but I'm not sure about the best automatic rule. I think that there are not very many unprefixed Latin words starting with abr, abl, obl, obr, adr, subl, subr, but there may be some.--Urszag (talk) 20:07, 10 May 2019 (UTC)

The module can't know when there's a prefix. Words starting with ad- may be prefixed, but they may also not be. I suggest using the . notation, e.g. {{la-IPA|ad.rogō}}(Classical) IPA(key): /ˈad.ro.ɡoː/, [ˈäd̪rɔɡoː]
@Rua, Urszag Urszag already addressed this concern. I checked the list of Latin lemmas beginning with 'abr-', 'abl-', 'adr-', 'adl-', 'subr-', 'subl-', 'obr-', 'obl-', and the only words I see where ab- ad- sub- or ob- is not a prefix are obrizus, obrussa, obryzum (all three from the same Greek root), abra (maybe a native Latin word?), and abrotonum, abrotanifolius, abrotanelloides (from Greek again). As a result I think Urszag's suggestion is a good one, esp. since the words I just mentioned can be explicitly syllabified using a.br..., o.br..., which is much better than having to syllabify all the other words (> 100). Benwing2 (talk) 19:37, 9 June 2019 (UTC)Reply
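
A minimal sketch of the automatic rule discussed in this section follows: split ab-, ob-, sub-, ad- from a following l or r unless the word is on a short exception list. The exception list is the one given in the comment above; the names `mark_prefix_boundary` and `strip_macrons`, and the use of "." as the boundary marker in the output, are assumptions for illustration and not the module's implementation.

```python
import re
import unicodedata

# Words listed above where ab-/ob- is not a prefix (macrons removed).
EXCEPTIONS = {
    "obrizus", "obrussa", "obryzum", "abra",
    "abrotonum", "abrotanifolius", "abrotanelloides",
}
PREFIX_BEFORE_LIQUID = re.compile(r"^(ab|ob|sub|ad)(?=[lr])")


def strip_macrons(text: str) -> str:
    """Drop combining marks so macronised spellings match the exception list."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def mark_prefix_boundary(word: str) -> str:
    if strip_macrons(word.lower()) in EXCEPTIONS:
        return word  # syllabify as a.br..., o.br... by the default rules
    # Insert an explicit syllable break after the prefix.
    return PREFIX_BEFORE_LIQUID.sub(r"\1.", word)
```

Words such as suburbium are untouched because the prefix is followed by a vowel, which is the resyllabifying case described at the top of the section.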

Traditional English pronunciation

Would it be possible to add the traditional English pronunciation of Latin to this module? 2601:140:8B80:F70:50A2:5304:DC35:BD6D 03:09, 2 August 2019 (UTC)Reply

Would it be possible to auto-ban anyone who dares mention that pronunciation on wiktionary? :))) Seriously, why would anyone want to read that? That would be pretty much the same as adding a Chinese Engrish pronunciation to English entries. Brutal Russian (talk) 12:23, 9 August 2019 (UTC)Reply
Meh, conceptually the Ecclesiastical Latin pronunciation is similar: it's a pronunciation influenced by Italian, while the traditional English pronunciation is influenced by English and its sound changes (though much more dramatically). But the traditional English pronunciation is probably not completely predictable, so I am not sure if it could be automatically generated. (Maybe with auxiliary notation in ambiguous cases.) It is also mostly used in Latin loanwords nowadays, so probably belongs in English entries. — Eru·tuon 18:48, 9 August 2019 (UTC)Reply

circumV, conj/v

Ok, I decided to stop postponing this:

  • Circum- before a vowel must undergo the same sandhi as any other word ending in a nasal vowel - superficially looking like a prefix does not influence its phonotactics -, with the effect that circumagere is pronounced [kɪr.kũˈ(w̃)ã.ɡɛ.rɛ], with no barbaric mytacisms in there. Thus the variant spellings CIRCVMEO~CIRCVEO, but regularly CIRCVITVS with no M, presumably showing the dropping of the glide between two high vowels. There are a few words like comēsse, comitium which were lexicalised with an [m] before it dropped, and which aren't counter-examples to this synchronic rule.
  • A similar yet different issue arises in words in conj-, particularly in the one spelt CONIVNX~COIVNX~COIVX, which shows that J after N~M was treated like any other fricative, with the effect of N-deletion with concomitant nasalisation, and with a nasal approximant [j̃] to boot, as there's no other way I see it existing between two nasal vowels: [cõːj̃ũːks]~[cõj̃.j̃ũː(ŋ)ks]. Might want to skip the diacritic for a looser transcription though - Palatino Linotype doesn't handle this one well at any rate. Cser 2016 opines: "Before glides the nasal is present in the spelling, suggesting coalescence into a long nasal vowel. There is one exception: conjicere ‘throw’ is attested more than sporadically as 〈coicere〉, which probably suggests a totally assimilated nasal, i.e. [kojjikere]." He goes on to suggest that "the spelling 〈con〉 stood for [kõː] before glides as well as before fricatives [..] convocare [kõːw-]" - a logically sound conclusion, though the transcriptions are a bit messy. I personally wouldn't outright argue for [kõːw-] because the spellings COMV- crop up a bit late and I'm not aware of any COV- spelling. I attribute the difference to the fricative, rather than semivocalic, nature of the Latin /j/ (what with the gemination and all), and the MV spelling would thus be consistent with the fricativisation of /w/, though that would mean that the NS-rule deletion had ceased to be productive by then. You already have to manually specify /conj/ (otherwise it parses I as a vowel), but it probably shouldn't apply to any /Vnj/ combination, in case it represents a secondary consonantal /j/ as in Vergil's /lāvīnja/. Also, I'm not sure if this should be extended to injicere and other possible cases of /inj-/.
  • circum-(j)icere combines the two issues, and in fact is likely to have had two pronunciations: the earlier [kɪr.cũ.(w̃)ɪ~e-] and the later [kɪr.kũj̃.j̃ɪ~e-], as all other compounds of jacere. Brutal Russian (talk) 14:11, 9 August 2019 (UTC)Reply
@Benwing2@Erutuon Suddenly, an obvious solution to the problem: currently the final -m is phonemically transcribed as /m/, which is of course plain wrong (even the Romans felt that M was unsuitable to spell it). To let the module decide how to treat it, it needs to be correctly identified as the placeless nasal /N/, behaving along the lines set out above and in Cser 2016. Every final -m spelling will thus be given this value by default, but it will be possible to specify it in transcription where it occurs word-internally. This seems both scientifically correct and practically desirable - would anyone be kind and willing to implement it? Brutal Russian (talk) 15:31, 20 August 2019 (UTC)
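
As a rough picture of this proposal, the sketch below treats every word-final m as a placeless nasal N by default, allows N to be written explicitly word-internally, and then realises vowel + N as a nasalised vowel. Everything here is an assumption for illustration: the function names, the use of a capital N in the respelling, and the decision to say nothing about the length of the resulting vowel are not taken from the module or from Cser 2016.

```python
import re

# Hypothetical sketch of the placeless-nasal proposal.
VOWELS = "aeiouy"
TILDE = "\u0303"  # combining tilde


def to_phonemic(respelling: str) -> str:
    """Word-final m becomes N by default; an explicit N is kept as written."""
    return re.sub(r"m$", "N", respelling)


def realise_placeless_nasal(phonemic: str) -> str:
    """Realise vowel + N as a nasalised vowel (vowel length is left open)."""
    return re.sub(rf"([{VOWELS}])N", lambda m: m.group(1) + TILDE, phonemic)
```

An editor would then write something like circuNagere by hand to request the sandhi treatment inside a word, while tam would get it automatically.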

Non-phonemic nasalisation before nasals

And since we're talking about nasal vowels, it's plain to see that Latin was a very nasalising language, yet its many nasalised vowels don't seem to have been phonemic, or at least only very passingly so. Thus I don't think it would be possible to maintain that all vowels before nasals, as well as before/after other nasal vowels, weren't allophonically nasalised as well. This happens in languages like Spanish (reflected in the orthography on this site), and can be seen in historic vowel changes before nasals in many a Romance language (there's a whole book on this, Nasal Vowel Evolution in Romance). For evidence inside Latin I can highlight spellings like DECEBRIS for Decembris - some more examples here - where the omission of the nasal in spelling parallels its omission word-finally, indicating the same phonetics in both cases. This Stack Exchange thread has three replies that opine in favour. Since I haven't read any work that would specifically argue this point of view, I'm posting this separately to see if anyone can offer an argument against this - though I struggle to imagine anyone giving a good reason to believe that tam is [tã:], whereas tandem is [tandẽ:] with specifically denasalised /a/, or that tandem mē is [tandẽm.me:] while tamen mē is [ta.mem.me:] and specifically not [ta.mẽm.me:]. Indeed, that would involve postulating non-nasal allophones for what are already - by every account I've seen - allophones of oral vowels and not separate nasal vowel phonemes. And even if we assume phonemic nasalisation, the prime examples here are Portuguese and to a lesser degree French, both of which show tautosyllabic nasalisation, and Portuguese has progressive as well as regressive nasalisation across syllable boundary. I would be happy with tautosyllabic (inside the same syllable) nasalisation for Latin - is anybody willing to implement that given consensus? Brutal Russian (talk) 14:34, 9 August 2019 (UTC)

@Brutal Russian I'm a bit confused as to exactly what you want implemented, maybe you can give a set of examples. Also, note that in European Portuguese, vowels are not (or rather, no longer I think) nasalized before M/N + vowel, and not nasalized before /nd/ (which is pronounced as two consonants, at least in the gerund ending in -ndo). Maybe even more to the point is modern Polish, where written nasal vowels ą ę are pronounced as sequences of oral vowel + consonant m/n except word finally or before /s/ /f/ maybe also /x/ (very much like in Classical Latin). This suggests that tandem [tandẽ] is not so far-fetched, along with the fact that modern Spanish and Italian presumably reflect exactly such pronunciation, given the fact that you get e.g. pontem [pɔntẽ] -> puente/ponte where Latin nasal vowels regularly lose the nasalization but there's no evidence of the n ever having been a nasal vowel. (Yes, Italian ponte [ponte] has an unexpected high-mid vowel but that can easily have been a much later change. Spanish puente/fuente/frente show that Proto-Romance had [ɔ].) Benwing2 (talk) 17:07, 18 August 2019 (UTC)Reply
@Benwing2 I'm proposing that a syllable containing a nasal nasalise the preceding vowel in that syllable (regardless of length, logically including diphthongs): movendō [mɔwɛ̃ndo:], cantandō [kãntãndo:], deinde [ˈdɛ̃i̯n.dɛ]. Portuguese does nasalise them before /nd/: movendo, cantando. In that language, the complication is that nasalisation is clearly phonemic and accompanied by raising and/or diphthongisation, so there can be a contrast between an allophonically nasalised low [ã] and a phonemically nasalised raised [ɐ̃] (subjectively, there seems to be variation in regard to this particularly in names and borrowings). I doubt that the degree of actual nasalisation differs between those, and if it does, this is phonemically determined - in Latin, as in Spanish, this wouldn't be an issue as nasalisation doesn't seem to affect vowel quality.
Regarding /pontem/, you seem to be confusing nasalisation with vowel length - and we've just made a change to this! Both nasalised and oral vowels have the same quality and therefore the same reflexes in Romance, it's only the length that matters. The mid-high vowel in pōns is due to it being phonemically long, not nasalised (and whether it actually was is doubtful). puente just points to the short length and the concomitant mid-low quality, not to presence or lack of nasalisation - but cf. monte! This type of words is problematic since they show different length reflexes inside the same language (more examples in Lindsay 1894 IIRC) - easily explained as interference from the Nominative.
Regarding Polish, I think you will agree that the vowel of dąb is plainly nasal, nor do I detect any difference in nasality between dzięki and język. The fact that the placeless nasal acquires place from the following consonant in that language - as well as in many other languages, subjectively including Portuguese - is separate from the process of denasalising the vowel itself, which is, I argue, a complication to the phonology that needs to be justified - and spellings like DECEBRIS certainly don't serve to justify it. The status of nasalisation in Polish is disputed and clearly in a state of change, and the rate of occurrence of nasal vowels is way lower than in Latin. And even with that, Polish has the very same phenomenon that I'm proposing to implement - or, rather, the lack of the very same phenomenon of denasalisation that I'm proposing to remove (in the standard pronunciation presented at Forvo, at least). Brutal Russian (talk) 14:40, 20 August 2019 (UTC)
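
To make the proposal in this section concrete, here is a small sketch of tautosyllabic nasalisation on an already syllabified phonetic string: the first vowel of a syllable receives a tilde when a nasal consonant follows it within the same syllable. The function names, the vowel and nasal inventories, and the use of "." as the syllable separator are assumptions for illustration, not the module's behaviour.

```python
# Hypothetical sketch of nasalisation restricted to the same syllable.
VOWELS = set("aeiouyɛɔɪʊ")
NASALS = set("mnŋ")
TILDE = "\u0303"  # combining tilde


def nasalise_syllable(syllable: str) -> str:
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            # Only a nasal after the nucleus (in the coda) triggers
            # nasalisation; an onset nasal such as the m of "mɔ" does not.
            if any(c in NASALS for c in syllable[i + 1:]):
                return syllable[: i + 1] + TILDE + syllable[i + 1:]
            return syllable
    return syllable


def nasalise(transcription: str) -> str:
    return ".".join(nasalise_syllable(s) for s in transcription.split("."))
```

Applied to syllabified versions of the three examples above, the sketch places the tilde on the same vowels as in the transcriptions given for movendō, cantandō and deinde.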

final nasalization again

(moved from User talk:Benwing2)

I think some of the recent changes must have broken the handling of long vowels, so that now any word that has a long vowel doesn't display the phonetic transcription. In addition, I'm not aware of a single inscription that would mark final nasal vowels as long. Plenty of them mark the vowels before NS sequences as long, which we also assume to be nasalised - and when the N was restored, the vowel stayed long (Allen 1978: "but the classical pronunciation WITH n also has a long vowel", evidence p.65). The non-shift of the final /um/ in Central/South Italian can be attributed to several factors, starting from there having been no such change there at all (the Dacian/South Italian Romance vowel system with /i~e/ but no /u~o/ merger) and all the current such reflexes being reshaped - possibly even word by word - on the Centre-North dialects, and ending with the fact that nasal vowels are perceptually higher than oral ones, thus being primed to escape the lowering of the /u/ (cf. StIt ponte with a closed /o/ instead of the expected open one). The French rien alone, reflecting /rĕm/ (through the recomposition of the nasal vowel into oral + /n/), looks like a good refutation of the idea that these vowels were long. Besides, most attempts at these sound horrible, particularly from English speakers (Stephen Daitz, A.Z. Foreman).

On a general note, would it be possible to consult me on any changes of phonetic nature you're going to do to the module? I'm very invested in the reconstruction of Latin pronunciation and I bet I could help more often than not. Tanks :3 Brutal Russian (talk) 13:37, 18 August 2019 (UTC)Reply

Actually, here's again from Allen: "The preceding vowel is in fact always short.[..]For the other vowels shortness is attested by an express statement of Priscian (K. ii, 23): 'numquam tamen eadem m ante se natura longam (uocalem) patitur in eadem syllaba esse, ut illam, artem, puppim, ilium, rem'. Short vowel is also attested for the last word by French rien (as bien from bĕn(e) and not as rein from rēn)." Brutal Russian (talk) 14:27, 18 August 2019 (UTC)Reply
@Brutal Russian Yes, absolutely, I will consult you in the future. I haven't made any changes to that module in quite a while and I'm not planning on any changes, but given what you've said above I'll revert my change that lengthened vowels before final nasal m. Can you point me to an example where a long vowel is causing the phonetic transcription not to display? I can easily find counterexamples such as actuālis. Benwing2 (talk) 15:09, 18 August 2019 (UTC)Reply
BTW there is one thing I was thinking of changing that I'd like to ask you about. The module claims that VL /t/ was palatalized to [tʲ] before /e/ and /i/, like in Russian. This seems very doubtful to me; there's no evidence of such palatalization, for example, in the reflexes of tempus in most languages. I was thinking of fixing this; what do you think? Benwing2 (talk) 15:11, 18 August 2019 (UTC)Reply
@Benwing2 Heh, I should have linked a word or two after all. It seems that any time the word contains the vowel a, short or long, the transcription breaks: faciō but virgō.
On the other count, I remember being surprised myself when I read that, but I never looked it up to try and find any mentions in the literature. My hunch is that it's an extrapolation of the assibilation of /tiV/ as well as the Romance diphthongisation. Though the latter would also imply a parallel velarisation of back vowels apart from /a/: tenet > /tʲene/, tonat > /tˠona/. Again, I haven't read any such suggestion before. I'll look around. Also, tanks again! Brutal Russian (talk) 15:27, 18 August 2019 (UTC)Reply
Ok, so after some more experimentation it's not just the vowel a (actuālis after all), but some combination of that and two consecutive vowels in the final syllable. Brutal Russian (talk) 15:35, 18 August 2019 (UTC)Reply
@Brutal Russian Not sure what the issue with faciō is. The module doesn't display the phonetic outcome if it's identical to the phonemic outcome. It's always worked this way, although if you find it confusing we can change it. Benwing2 (talk) 15:36, 18 August 2019 (UTC)Reply
BTW the reason the phonetic outcome is identical for faciō is that there's a rule raising unstressed [ɪ] to [i] before a vowel. Benwing2 (talk) 15:38, 18 August 2019 (UTC)Reply
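To illustrate the two behaviours just described (raising of unstressed [ɪ] before a vowel, and hiding the phonetic line when it equals the phonemic one), here is a minimal Python sketch. It is not the module's Lua code: the function names, the vowel set and the input transcriptions for faciō and virgō are assumptions made for the example.

```python
# Hypothetical vowel inventory for this sketch; not taken from the module.
VOWELS = set("aeiouyɪʊɛɔ")


def raise_prevocalic_i(phonetic: str) -> str:
    """Raise [ɪ] to [i] at the end of an unstressed syllable
    when the following syllable begins with a vowel."""
    syllables = phonetic.split(".")
    result = []
    for idx, syl in enumerate(syllables):
        nxt = syllables[idx + 1] if idx + 1 < len(syllables) else ""
        nxt_first = nxt.lstrip("ˈ")[:1]  # first segment, ignoring a stress mark
        unstressed = not syl.startswith("ˈ")
        if unstressed and syl.endswith("ɪ") and nxt_first in VOWELS:
            syl = syl[:-1] + "i"
        result.append(syl)
    return ".".join(result)


def render(phonemic: str, phonetic: str) -> str:
    """Show the phonetic form only when it differs from the phonemic one."""
    phonetic = raise_prevocalic_i(phonetic)
    if phonetic == phonemic:
        return f"/{phonemic}/"
    return f"/{phonemic}/, [{phonetic}]"


print(render("ˈfa.ki.oː", "ˈfa.kɪ.oː"))  # faciō: phonetic form is suppressed
print(render("ˈwir.ɡoː", "ˈwɪr.ɡoː"))    # virgō: both forms are shown
```

Under these assumptions faciō ends up with a single transcription while virgō keeps two, which is the contrast reported above.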
Haha, well I guess I do find it rather confusing! No objections to it being changed from me :P Also, while we're on this topic, what do you think about the two (actually more) issues I raised on the pronunciation module's talk page recently, and the technical possibility of their correction? The allophonic nasalisation code should be stealable from Spanish, for instance. Good occasion for me to fix the vowel length of āctuālis, by the way. Brutal Russian (talk) 15:56, 18 August 2019 (UTC)Reply
Another thought on the nasal vowel matter - if e.g. French reflexes indicate quantity rather than quality (as in pre-vocalic short vowel raising), it follows that all the final short vowels should be the short nasalised allophones, e.g. the open [ɛ̃], not the current closed [ẽ]. This would of course mean that the raising effect of nasalisation either didn't exist or didn't affect phonemic outcomes, at least in Gallia (indeed, the latest round of French nasalisation lowered the front ones instead), but I don't know of any evidence that would counter-weigh the evidence of bien/rien. Still, this would mean a need for separate treatment when followed by NS (>> long and closed) or not (short and open). This might be easier to implement after implementing allophonic nasalisation, which is required for all the vowels. Brutal Russian (talk) 17:05, 18 August 2019 (UTC)Reply
@Brutal Russian I can easily implement [ɛ̃] and [ɔ̃] independently of anything else, it's just a change to two lines. I wonder however whether rien is a probative example for very many uses, given that it's stressed and may originate from a form like *rɛne (compare Latin cor -> Italian cuore); OTOH the low-mid ɛ presumes something like rem [rɛ̃] at least in monosyllables. Benwing2 (talk) 17:13, 18 August 2019 (UTC)Reply
@Brutal Russian what about final -im, -um, -ym, do we use [ɪ̃] [ʊ̃] [ʏ̃] there too? Benwing2 (talk) 17:18, 18 August 2019 (UTC)Reply
@Benwing2 Well, if you're wondering whether final short vowels on the whole might have been raised, I'm still unclear on this - Archaic Latin lowered them instead, and there are, for instance, plenty of Italian dialects where final vowels are open. I don't recall seeing any inscriptions that would confuse a final E for I, for instance - and the other way around can all be ascribed to archaic spellings, I think. Verbal endings (with or without the consonant dropped) all seem to show mid for high vowels. *rɛne isn't possible in my view since it was an inflected form of rēs and the -n was the Accusative ending. I don't know when the -e of cuore appeared, but it looks to be either the peculiarly Italian syllabic rhyme constraint in action, or an analogy to another former neuter - latte. In any case it's different to -n since that was analogically extended with -o in sono - a final -n in general doesn't seem to violate the same constraint. I fixed the other vowels since the principle is the same and Priscian lists -am -im -em -um as short. Brutal Russian (talk) 18:42, 18 August 2019 (UTC)Reply
About the length of final vowel plus m: in the Aeneid it always scans as long, unless it's elided; for instance, in multum ille et terris iactatus et alto / vi superum saevae memorem Iunonis ob iram (1.3-4), vi superum scans as long–short–short long and in memorem Iunonis ob iram, -rem scans as long (-ram at the end is anceps). I think transcribing these cases as a lengthened nasalized lax vowel (/rem/, [rɛ̃ː]) would reconcile both the Romance reflexes and the long scansion. — Eru·tuon 17:48, 18 August 2019 (UTC)Reply
@Erutuon This is due to the fact that the placeless nasal consonant, when followed by another consonant, gets the place node from that consonant, surfacing as [n] when followed by a dental, [m] by a labial and [ŋ] by a velar, hence the spellings tanquam, tandem and compīlāre (com- had the same placeless nasal). It only ostensibly becomes long when a fricative, and possibly an /n/ follows: cōnīvēre~connīvēre - in that case almost certainly with the appropriate change of quality to that of the oral long [o]. It's a well-known fact that all long vowels shortened before final M in pre-literary Latin (e.g. Sihler 2008: "Long vowels were regularly shortened before final -m"), and I think that postulating a special contrast between long closed, short open and long open nasalised vowels, that would have to have existed for a short period, would require good evidence. Brutal Russian (talk) 18:42, 18 August 2019 (UTC)Reply
So to make it absolutely clear, /saevam deam/ is ['sae̯.wãnˈde.ã], /saevam mātrem/ ['sae̯.wãmˈmaː.trɛ̃], /saevam canem/ is ['sae̯.wãŋˈka.nɛ̃], and only /saevam sorōrem/ is ['sae̯.wãː.sɔ'roːrɛ̃]. When a /j/ follows, I personally think it was more likely doubled and nasalised than lengthened in endings (my transcription of /circumje-/ as [kɪr.kũj̃.j̃ɪ~e-]), but there's generally no contrast between the two and the CL cujus with an etymological /jj/ surfaces as Sp. cuyo, reflecting */cūjum/ - but as long as the syllable remains heavy, it's all good. The part about lengthening before /n/ I take back - judging from etymology, the root of cōnīvēre seems to have started in a velar, which itself became placeless (causing the nasal of the prefix to lose its place and nasalise+lengthen the vowel instead) and eventually dropped in initial gn-, but the prefixed forms remained e.g. īgnōrō from *gnōrō and cōnectō from *gnectō. Brutal Russian (talk) 19:25, 18 August 2019 (UTC)Reply
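The sandhi pattern in the four saevam examples can be summarised in a small Python sketch. This only restates the transcriptions given above; the function name and segment classes are assumptions, and contexts the examples do not cover (a following vowel or /j/) are deliberately left out.

```python
# Segment classes are assumptions made for this sketch.
DENTALS = set("tdn")
LABIALS = set("pbm")
VELARS = set("kɡg")
FRICATIVES = set("sf")
NASAL = "\u0303"  # combining tilde


def final_m_sandhi(vowel, next_initial=None):
    """Realisation of word-final vowel + /m/ as a function of the first
    segment of the following word (None = before a pause)."""
    nasal_vowel = vowel + NASAL
    if next_initial is None:
        return nasal_vowel            # phrase-final: short nasal vowel
    if next_initial in DENTALS:
        return nasal_vowel + "n"      # saevam deam
    if next_initial in LABIALS:
        return nasal_vowel + "m"      # saevam mātrem
    if next_initial in VELARS:
        return nasal_vowel + "ŋ"      # saevam canem
    if next_initial in FRICATIVES:
        return nasal_vowel + "ː"      # saevam sorōrem
    raise ValueError("context not covered by the examples above")


for nxt in ("d", "m", "k", "s", None):
    print(repr(nxt), final_m_sandhi("a", nxt))
```

In every non-final context the syllable comes out heavy (closed by a nasal, or with a long vowel); only the prepausal case yields a short nasal vowel.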
I chose those two examples precisely because the m would not be realized as a nasal consonant. Sounds like you're saying that a syllable ending in a vowel plus m remains long (heavy) in all positions except before a pause, where it ends in a short nasal vowel. I don't have a counterargument based on dactylic hexameter. I suppose one prepausal position might be at the end of a line in a position where we write a period, comma, or semicolon (the position of iram above); but there the syllable can be short or long. — Eru·tuon 22:03, 18 August 2019 (UTC)Reply
@Erutuon Actually, the one in /multum ille/ is realised precisely as a nasal consonant: [mʊɫ.tũ.w̃ɪ̃l.lɛ ~ mʊɫ.tw̃ɪ̃l.lɛ] (the latter presumably reflected in poetry). The avoidance of elision of nasal vowels before short vowels, just like that of long vowels and diphthongs before short vowels, indicates that they didn't get elided entirely like short vowels did. In stressed monosyllables the junction results in two full syllables, the former one short: /quem ille/ is always three syllables, though such combinations are on the whole avoided in poetry. For the discussion see Allen again, p.31 and p.78. This, the reduction of /om#/ to /um#/ and the preference for nasal vowels in the anceps position make it clear that these were considered closed syllables, but just how they were realised when stressed and unstressed cannot be determined - the closest to a description of their sound that I know is Quintilian's "Atqui eadem illa littera, quotiens ultima est et vocalem verbi sequentis ita contingit ut in eam transire possit, etiam si scribitur, tamen parum exprimitur, ut 'multum ille' et 'quantum erat', adeo ut paene cuiusdam novae litterae sonum reddat. Neque enim eximitur sed obscuratur, et tantum in hoc aliqua inter duas vocales velut nota est, ne ipsae coeant" (which Consentius confirms when he cites Celsius "[hoc fit] pessime, ubi ei, quae tollitur, accedit et consonans, quale est 'multum ille et terris iactatus et alto'."), and the same Quintilian notes on the pronunciation of final /m/ "Quid quod pleraque nos illa quasi mugiente littera cludimus, in quam nullum Graece uerbum cadit?".
From this, and from the Romance reflex of stressed nasal vowels as /Vn/ and unstressed ones as /V/ with no trace of nasalisation, my impression is that stressed nasalised vowels were heavy syllables that ended in a consonant such as [w̃] or [ŋ] (the latter is particularly common in Romance and cross-linguistically), or even a bilabial approximant or non-continuant nasal (the kind described here for Japanese), but the vowels were certainly phonemically short, and any lengthening they might have experienced was allophonic. You can get a sense for the possible realisations by browsing this pronunciation atlas - try words like Latin tenet, in and nōn (word and language selection on the right). I personally would be fine with transcribing stressed final nasal vowels as ending in a (possibly superscript) velar, labial or labiovelar nasal approximant to indicate that they were indeed closed syllables, but I have no idea how it's possible to decide on which one of these to choose, or a different one. The seemingly next best option is to mark them as semi-long, e.g. [ãˑ].
By the way, this behaviour - short in isolation but becoming a closed syllable in composition with consonants - finds an excellent parallel in the Italian w:Syntactic_gemination, with the obvious conclusion that the latter must at least partially continue the former. Brutal Russian (talk) 14:40, 19 August 2019 (UTC)Reply
In speaking about cases where the m in question was not realized as a nasal consonant, I was referring to -um in vi superum and -em in memorem Iunonis ob iram, the cases from Aeneid 1.3-4 that I singled out in my original post, not to multum ille. (As for multum ille, I was taught the traditional view that a final vowel plus m is elided, though that could be a convenient falsehood – meaning that treating it as elided typically gives the right scansion. But the actual realization is beside the point because I only mentioned the supposed elision to say that I wasn't talking about cases where it applied.) Could you clarify how your comment relates to what I was saying about the prepausal vowel plus m being short (light) and other realizations being long? I'm speaking about the transcriptions you gave – /saevam deam/ is ['sae̯.wãnˈde.ã], /saevam mātrem/ ['sae̯.wãmˈmaː.trɛ̃], /saevam canem/ is ['sae̯.wãŋˈka.nɛ̃], and only /saevam sorōrem/ is ['sae̯.wãː.sɔ'roːrɛ̃] – here each example has two cases of a word-final vowel plus m, and the non-phrase-final cases produce a long syllable, but the phrase-final a short syllable. What causes this shortening before a pause? Why is saevam sorōrem not [sae̯.wãː.sɔ'roːrɛ̃ː], for instance, so that the final syllables of saevam and sorōrem have the same length? — Eru·tuon 17:50, 19 August 2019 (UTC)Reply
It's the so-called NS-rule causing any nasal (/m/, /n/ or /N/) to lengthen and nasalise the homosyllabic vowel before fricatives, as in mēnsa, cōnsul, īnstar, cōnflō. All these vowels are originally short, but marked with the apex in inscriptions, are explicitly mentioned as being long by Cicero, Gellius and others, and their Romance reflexes, when not recomposed, are those of long vowels. In contrast, there are no instances of final nasal vowels marked with an apex that I know of, and their Romance reflexes, when stressed, are those of short vowels - in unstressed position it doesn't seem possible to tell. Brutal Russian (talk) 13:19, 20 August 2019 (UTC)Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── @Brutal Russian, Erutuon BTW, you quoted Cser 2016 above, with a link to the text; on p. 31 this text has puellam [puellãː] and fuerim [fuerĩː]. Benwing2 (talk) 02:27, 23 August 2019 (UTC)Reply

@Benwing2, Erutuon Right, but the closest thing to an explanation for this is his "the invariable length of nasal vowels" - from this postulate he extends the length found before fricatives to the final position. We certainly don't know what evidence Cser did and did not take into consideration. It's conceivable that the process of denasalisation happened everywhere at the stage when long and short pairs were as in CL (so mit paired with mīt and not with mēt), so that the long nasal vowel of rem was a close [ẽ:] like the long oral vowel /e:/, and became a sequence of the short corresponding oral vowel /e/ [ɛ̃] and /n/ like in most (all?) other instances of Romance reflexes pointing to the short vowel pair of the long vowel found in CL, e.g. Fr. froit << fructus for CL frūctus. There's actually a tentative instance of this word-internally: the Spanish pienso (I think), a doublet that might have arisen already in Late Latin as a spelling pronunciation and went on to become lexicalised in a different meaning from the inherited peso (I weigh). Perhaps someone might suggest more instances of what seem to be reflexes of short vowels before a restored nasal in non-final position? These might allow us to conclude that vowel length and the associated closeness were in direct correlation with the degree of nasalisation, and that all fully nasalised vowels with full deletion of the conditioning nasal were indeed long and close. But then, when they were simply denasalised with no fall-out of /n/, one would expect them to stay closed and /um#/ to merge with the reflex of /u:/, which it didn't. The denasalisation of consul that Wikipedia describes started pre-Classically, and in CL both reflexes seem to have co-existed (e.g. Cicero is said to have spoken Forēsia, Megalēsia, Hortēsia). Brutal Russian (talk) 03:37, 27 August 2019 (UTC)Reply
And here's M. Weiss (his 2009 "Outline..." is a must-have for anyone interested in Latin and IE linguistics) adducing some more evidence for those vowels' shortness. Of course it being short in cretics doesn't exclude it being long in non-cretics... Brutal Russian (talk) 10:30, 27 August 2019 (UTC)Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── @Benwing2, Erutuon I've made final vowels semi-long as this reflects that they aren't phonetically or phonemically equal to a short vowel (being underlyingly bimoraic and as such being able to end a monosyllable, which short vowels can't do), but patterning with short ones in terms of quality. However, for me, the font properly displays them only with the vowel /a/ (quam), all others (tum, quem) being displayed with the nasalisation sign over the length mark instead of the vowel. Is this just on my end (the font of the transcription doesn't seem to change with browser settings)? If it's not, should display issues like that influence our transcription choices? Brutal Russian (talk) 17:29, 1 October 2019 (UTC)Reply

@Brutal Russian: For me quam looks okay (it's being displayed in Gentium Plus), but tum, quem are as you described. This was because the diacritic was placed after the length mark; I've fixed that. — Eru·tuon 17:38, 1 October 2019 (UTC)Reply
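The display problem comes down to the order of two characters: a combining tilde attaches to whatever precedes it, so it has to come before the length mark rather than after it. How the module was actually corrected is not stated here; the following Python sketch, with an assumed function name, only shows the reordering idea.

```python
import re

COMBINING_TILDE = "\u0303"
LENGTH_MARKS = "ːˑ"  # long and half-long marks


def fix_nasal_order(ipa: str) -> str:
    """Move a combining tilde that follows a length mark so that it
    follows the vowel instead (vowel + tilde + length mark)."""
    pattern = "([" + LENGTH_MARKS + "])" + COMBINING_TILDE
    return re.sub(pattern, COMBINING_TILDE + r"\1", ipa)


broken = "tʊ" + "ˑ" + COMBINING_TILDE   # tilde lands on the length mark
fixed = fix_nasal_order(broken)
print([hex(ord(c)) for c in fixed])     # vowel, tilde, then length mark
```

Strings that already have the tilde directly on the vowel pass through unchanged, so the function can be applied to every transcription.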

automatic consonantal i after prefixes[edit]

(Notifying Fay Freak, Brutal Russian, JohnC5, Lambiam): I am thinking of implementing a rule whereby i before a vowel and after any of the prefixes ab-, ad-, circum-, con-, dis-, ex-, in-, inter-, ob-, per-, sub-, subter-, super-, and trans- is rendered as /j/ by default. The reason for this is that the great majority of such cases are in fact composed of prefixes plus a consonantal i, and currently all of these need to be marked specially in the argument to {{la-IPA}} by using j instead of i. This makes it harder, e.g., to auto-generate pronunciations, as I'd like to do. As an example, looking for cases of words beginning adi- + vowel yields this:

Page 923 adiaceo: Processing
Page 924 adiaculatus: Processing
Page 925 adiantum: Processing
Page 926 adiaphoros: Processing
Page 928 adiecticius: Processing
Page 929 adiectio: Processing
Page 930 adiectivalis: Processing
Page 931 adiectivus: Processing
Page 949 adiudicatio: Processing
Page 950 adiudico: Processing
Page 951 adiuero: Processing
Page 952 adiugo: Processing
Page 953 adiumentum: Processing
Page 954 adiunctio: Processing
Page 955 adiunctior: Processing
Page 956 adiunctivus: Processing
Page 957 adiunctor: Processing
Page 958 adiunctus: Processing
Page 959 adiungo: Processing
Page 960 adiuramentum: Processing
Page 961 adiuratio: Processing
Page 962 adiurator: Processing
Page 963 adiuratorius: Processing
Page 964 adiuro: Processing
Page 965 adiutabilis: Processing
Page 966 adiuto: Processing
Page 967 adiutor: Processing
Page 968 adiutorium: Processing
Page 969 adiutrix: Processing
Page 970 adiuvo: Processing
Page 13860 fortis Fortuna adiuvat: Processing

Of these, 29 have consonantal i and only 2 (adiantum, adiaphoros) have vocalic i. This change would mean that the small number of cases of vocalic i in this circumstance would need to add a . (syllable divider) either after the i or before the consonant that precedes the i. Thoughts? Benwing2 (talk) 02:49, 23 August 2019 (UTC)Reply
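A rough Python sketch of the proposed default, for illustration only (the real module is written in Lua and its internals are not shown here). The prefix list is the one given above; the function name, the vowel set and the handling of the "." override are assumptions.

```python
PREFIXES = ("ab", "ad", "circum", "con", "dis", "ex", "in", "inter",
            "ob", "per", "sub", "subter", "super", "trans")
VOWELS = set("aeiouyāēīōūȳ")


def consonantal_i_after_prefix(word: str) -> str:
    """Rewrite prefix + i + vowel as prefix + j + vowel.
    A '.' next to the i (e.g. 'ad.iantum' or 'adi.antum') blocks the rule."""
    # Longest prefixes first, so that 'inter' and 'subter' win over 'in' and 'sub'.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        if len(rest) >= 2 and rest[0] == "i" and rest[1] in VOWELS:
            return prefix + "j" + rest[1:]
    return word


for w in ("adiungo", "interiectio", "adicio", "adiantum", "adi.antum"):
    print(w, "->", consonantal_i_after_prefix(w))
```

Unmarked adiantum is wrongly turned into adjantum by this sketch, which is exactly the small class of vocalic-i words that would need the explicit syllable divider.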

As with the sub/ad/ab/ob situation I discussed above, it's too bad that the pronunciation module can't access etymological information (specifically, whether a word is prefixed). In theory, that seems like the most reliable way to do it, and entries are supposed to have information about the etymology/morphology of words anyway (which allows things like automatically generated categories for words with a certain prefix). But I don't know what can and can't be done from a technical perspective with modules. If you're going just by spelling, one possible way to exclude some false positives would be to have a list of specific bases (or more specifically, of letter sequences that occur at the start of certain bases) rather than just targeting i followed by any vowel. Latin has relatively few words starting with /j/, and I think that only a few of them are responsible for many of the compounds of the type that you're discussing, which might allow you to get fairly good coverage with a method like this. For example, for iaceo the module could look for the sequences "iac" and "iect". (Of course, iaceo also forms -ic compounds like adicio which are just not spelled according to their pronunciation.) A human could predict the syllabicity of the i in adiaphoros with some confidence because of the characteristically Greek ph digraph and -os termination, but I think it's probably not worthwhile to try to add those kinds of things as criteria to a module here.--Urszag (talk) 03:05, 23 August 2019 (UTC)Reply
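The base-list alternative could look roughly like the Python sketch below. Only the two letter sequences named above ("iac" and "iect") are included; a usable list would need many more, and the function name is hypothetical.

```python
PREFIXES = ("ab", "ad", "circum", "con", "dis", "ex", "in", "inter",
            "ob", "per", "sub", "subter", "super", "trans")
# Only the sequences mentioned in the comment above; deliberately incomplete.
J_BASES = ("iac", "iect")


def mark_j_by_base(word: str) -> str:
    """Turn i into j only when a known prefix is followed by a listed base."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix):
            rest = word[len(prefix):]
            if rest.startswith(J_BASES):
                return prefix + "j" + rest[1:]
    return word


for w in ("adiaceo", "adiectio", "adiantum", "adiungo"):
    print(w, "->", mark_j_by_base(w))
```

With this short list adiantum is correctly left alone, but adiungo is missed as well, which shows the trade-off: fewer false positives, at the cost of maintaining a list of bases.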
Etymological info might have helped reduce the number of false positives, but cannot prevent them, as for abieram and subieram, so the automatically generated pronunciations still need to be checked.  --Lambiam 09:45, 23 August 2019 (UTC)Reply
(Notifying Fay Freak, Brutal Russian, JohnC5, Lambiam): @Urszag I implemented this idea and in the process fixed several existing cases where the i had wrongly failed to be converted to j in the {{la-IPA}} template (now no longer necessary). Urszag, I thought of implementing your idea of listing out the roots beginning with /j/ but there are quite a lot of them and it seemed error-prone. If/when I get around to adding pronunciations by bot, I'll have to manually check the ones with i + vowel after what looks like a prefix. I don't think this will be too hard; it's quick to go through a list manually and pick out the ones that need manual specification, and the bot itself can have a special case to handle derivatives of . Benwing2 (talk) 19:56, 25 August 2019 (UTC)Reply
@Benwing2 I see. That seems fine. It seems unnecessary to edit to remove manually added j from IPA transcriptions that already had it, though.--Urszag (talk) 22:37, 25 August 2019 (UTC)Reply

Intervocalic i/j[edit]

(Notifying Fay Freak, Brutal Russian, JohnC5, Lambiam): @Urszag I would like to clean up the module's handling of intervocalic i and j. Currently an intervocalic i is automatically converted to /jj/ but an intervocalic /j/ is not, which makes no sense. What are the rules for intervocalic i and j? Are they always pronounced double (except maybe with certain prefixes such as trā-)? Does the preceding vowel get shortened if it's long? Benwing2 (talk) 02:21, 31 August 2019 (UTC)Reply

I don't think anyone knows. Etymologically speaking, pompejjus and plēbējus should be different, but spellings like pompeus point to a single [j].(*) Italian has tragittare, tragitto, tragetto (notice the single /g/ unlike aggettivo), and even traghettare all of a sudden. And then you read aiutare, aggiutare and atare! >__< There are spellings like coiiux, κοζους for conjūnx (never with ω in Greek, indicating a short vowel) and this by Velius Longus: incipit per tria i scribi 'coiiicit,' ut prima syllaba sit 'coi', sequentes duae 'iicit'. I think long vowel/double /j/ were in a complementary distribution, though in precisely which one is not apparent to me. I think postulating a double /jj/ after a short vowel and single /j/ after a long one is safe for the time being, even if it's not always clear which it is, particularly in the endings. Brutal Russian (talk) 04:16, 31 August 2019 (UTC)Reply
(*) Cassiodorus, however, says: 'Pompeiius', 'Tarpeiius' et 'eiius' per duo i scribenda sunt, et propter sonum (plenius enim sonant), et propter metrum. Numquam enim longa fiet syllaba nisi per i geminum scribatur - a pretty unambiguous indication of the shortness of the vowels themselves. There might very well have been a range of realisations of these words - I wouldn't even exclude /'pom.peus/. Brutal Russian (talk) 05:15, 31 August 2019 (UTC)Reply
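The working rule suggested above (double /jj/ after a short vowel, single /j/ after a long one) is simple enough to sketch in Python. The input convention (j-spellings with macrons for long vowels) and the function name are assumptions made for the example; this is not the module's code.

```python
SHORT_VOWELS = set("aeiouy")
LONG_VOWELS = set("āēīōūȳ")
ALL_VOWELS = SHORT_VOWELS | LONG_VOWELS


def intervocalic_j(word: str) -> str:
    """Double an intervocalic j after a short vowel; keep it single
    after a long vowel and in every other position."""
    out = []
    for i, ch in enumerate(word):
        between_vowels = (
            0 < i < len(word) - 1
            and word[i - 1] in ALL_VOWELS
            and word[i + 1] in ALL_VOWELS
        )
        if ch == "j" and between_vowels and word[i - 1] in SHORT_VOWELS:
            out.append("jj")
        else:
            out.append(ch)
    return "".join(out)


for w in ("pompejus", "plēbējus", "ejus", "jam"):
    print(w, "->", intervocalic_j(w))
```

Either way the syllable before the j stays heavy: by gemination when the vowel is short, by vowel length otherwise.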

Incorrect syllabification for words with ps/bs[edit]

The pronunciation module currently syllabifies words like obsolētus as /o.psoˈleː.tus/, [ɔ.psɔˈɫeː.tʊs] (example: Classical /ob.soˈleː.tus/, [ɔps̠ɔˈɫ̪eːt̪ʊs̠]; modern Italianate Ecclesiastical /ob.soˈle.tus/, [obsoˈlɛːt̪us]), which is certainly incorrect: the syllable break comes between the [p] and the [s] in all Latin words containing intervocalic [ps]. (The syllabification of [ps] followed by a consonant, as in abstēmius, is more arguable, but it's not so important because both [ps.t] and [p.st] correctly predict that the preceding syllable is heavy.) I don't know enough about Greek to say whether a different treatment is possible for Greek words, but I don't remember hearing of that: even though Greek has [ps] as a valid word-initial cluster, I don't think a syllable with a short vowel is light before word-internal [ps] in Greek. Someone should fix the behavior of the module with native Latin words containing intervocalic [ps] and check whether it should ever be using this kind of syllabification for words from Greek.--Urszag (talk) 01:41, 6 September 2019 (UTC)Reply
Greek, unlike Latin (outside poetic license), resyllabifies initial clusters even at word boundary, and word-internally it's only liquid-containing clusters that can be tautosyllabic. A /ps/ onset is outright banned by Latin phonotactics. Brutal Russian (talk) 13:36, 13 September 2019 (UTC)Reply
I don't understand the logic of the module yet, but my guess is that /ps/ was included in the list of onsets because of the existence of word-initial /ps/ in loans from Greek. This module will have to work with these words, not only with words that follow native Latin phonotactics, and transcribing the cluster as an onset in this (very specific) context does seem like the best option to me. There are linguistic theories that treat word-initial /ps/ as a non-onset cluster (e.g. it can be posited that the /p/ is in an extrasyllabic word-initial "appendix" slot, or in a coda slot after an empty nucleus), but I don't think it's reasonable to expect readers to be familiar with such theories: a transcription like /pˈsel.kis/ or /p.soˈa.di.kus/ seems likely to cause confusion, so I support omitting a syllable division marker in transcriptions like /psel.kis/ and /psoˈa.di.kus/. However, I can't think of any position other than word-initial where /ps/ is best transcribed as a syllable onset. (In theory, if any Greek words with "κψ" or "σψ", like "ἔκψυξις" or "δύσψυκτος", had entered Latin, it might make sense to transcribe them with /k.ps/ and /s.ps/ ... but I can't find any examples of such words.)--Urszag (talk) 22:36, 17 September 2019 (UTC)Reply
I think I've fixed the problem, at least for the words with ps between vowels. The method of dividing consonant clusters between vowels seems a bit odd. By default the module puts as many consonants as possible in the onset of a non-initial syllable (i.e., the longest sequence that is found in the onsets table), but there are manually coded exceptions. There were exceptions for gn and for clusters beginning with s, so I added one for b or p followed by s. This feels very hacky and makes it hard to figure out if it correctly handles all cases so I think it could use rewriting. — Eru·tuon 00:00, 18 September 2019 (UTC)Reply
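For readers unfamiliar with the approach, the "longest onset, with exceptions" strategy described above can be sketched in Python as follows. Everything here is an assumption made for illustration: the onset table is a small invented sample, and the exceptions are modelled as "valid word-initially but not treated as an onset between vowels", which may differ from how the Lua module encodes them.

```python
# Small, hypothetical onset table; the module's real table is not shown here.
ONSETS = {
    "p", "b", "t", "d", "k", "g", "s", "f", "l", "r", "m", "n",
    "pr", "br", "tr", "dr", "kr", "gr", "pl", "bl", "kl", "gl", "fl", "fr",
    "st", "sp", "sk", "str", "spr", "skr", "gn", "ps",
}


def blocked_medially(onset: str) -> bool:
    """Assumed exceptions: gn, s + consonant clusters, and b/p + s."""
    if onset == "gn":
        return True
    if len(onset) > 1 and onset[0] == "s":
        return True
    return onset in ("ps", "bs")


def split_cluster(cluster: str):
    """Split a consonant cluster between two vowels into (coda, onset),
    giving the onset the longest permitted suffix of the cluster."""
    for start in range(len(cluster)):
        onset = cluster[start:]
        if onset in ONSETS and not blocked_medially(onset):
            return cluster[:start], onset
    return cluster, ""


for c in ("tr", "ps", "gn", "st", "str"):
    print(c, "->", split_cluster(c))
```

With the b/p + s exception in place, the cluster in obsolētus splits as p.s, so the first syllable is closed and therefore heavy.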
Thanks for the quick fix.--Urszag (talk) 05:00, 18 September 2019 (UTC)Reply

Allophones of L; Dental articulations[edit]

The current distribution - same velarised allophone in onset (except before /i/) as in coda - seems unsatisfactory to me and contradicts the statements of the grammarians, even Priscian, who cites Pliny: L triplicem, ut Plinius videtur, sonum habet: exilem, quando geminatur secundo loco posita, ut ille, Metellus; plenum, quando finit nomina vel syllabas, et quando aliquam habet ante se in eadem syllaba consonantem, ut sol, silva, flavus, clarus; medium in aliis, ut lectum, lectus. Pompeius doesn't make the triple distinction when he says that exilius autem proferenda est ubicumque ab ea verbum incipit; ut in 'lepore', 'lana', 'lupo', grouping onset with double. I propose to implement this per Sen's analysis: velarised [ɫ] in coda and before /u(:)/, /o(:)/, /a(:)/; palatalised [lʲ] when double and before /i(:)/; clear [l] before /e(:)/. On another note, I wanted to go ahead and supply /d/, /t/, /n/ and /l/ with IPA dental/denti-alveolar notation: [t̪], [d̪], [n̪], [l̪], [ɫ̪] or [l̪ˠ] - but I have a suspicion that just replacing every "n" with "n̪" might not be the way to do it. Could someone send helps? Brutal Russian (talk) 14:53, 13 September 2019 (UTC)Reply
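The distribution proposed above can be written out as a short Python sketch. It assumes plain phonemic input with length marked by ː (so that the segment after /l/ is always a bare vowel or consonant symbol) and uses a hypothetical function name; it is not the module's Lua code and it ignores the dental diacritics.

```python
BACK_VOWELS = set("uoa")


def l_allophones(phonemic: str) -> str:
    """Replace each /l/ with the allophone proposed above:
    [lʲ] when geminate or before /i/, [l] before /e/,
    [ɫ] before /u o a/ and in coda position."""
    out = []
    for i, seg in enumerate(phonemic):
        if seg != "l":
            out.append(seg)
            continue
        prev = phonemic[i - 1] if i > 0 else None
        nxt = phonemic[i + 1] if i + 1 < len(phonemic) else None
        if prev == "l" or nxt == "l" or nxt == "i":
            out.append("lʲ")   # double l, or before /i(ː)/
        elif nxt == "e":
            out.append("l")    # clear l before /e(ː)/
        else:
            out.append("ɫ")    # before /u o a/, before a consonant, word-final
    return "".join(out)


for w in ("ille", "sol", "silwa", "lektus", "lupus"):
    print(w, "->", l_allophones(w))
```

Note that this gives a velarised onset in lupus and lāna, which follows the proposal above rather than Pompeius' remark grouping word-initial l with the double one.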

@Brutal Russian:I've only just started to familiarize myself with the structure of the module, but it looks like you found out where to put the edits that you wanted to make (the "phonetic rules" tables--but did you mean to omit a rule for replacing /l/ with [l̪]? it currently shows up as plain [l], as in the classical transcription of ecclesia on the following page: Module:la-pronunc/testcases). I had a few questions about these changes. Is the exact position of the coronal plosives and liquids in Classical Latin known with as much certainty as the other things that we include in our transcription, or is it more of an educated guess? For consistency, if we use such a narrow transcription for Classical Latin, shouldn't we also use the symbols [t̪] and [d̪] for the Ecclesiastical Latin and Vulgar Latin phonetic transcriptions? The Wikipedia article on present-day Italian pronunciation indicates that Italian /t/ and /d/ are typically realized as laminal denti-alveolar [t̪] and [d̪], but that "/n, l, r/ are apical alveolar [n̺, l̺, r̺] in most environments"; presumably, these remarks would apply equally to the Italian traditional pronunciation of Latin, so if we're using the symbols [t̪] [d̪] for the classical phonetic transcription, I will edit those diacritics in for the Ecclesiastical phonetic transcription. I'm also a little bothered by the inconsistency in using a narrow transcription for the plosives /t/ /d/, but a transcription that is probably fairly broad for the plosives /k/ /g/, which were likely pronounced as pre-velar [k̟] [ɡ̟] before the front vowels /i(ː)/ and /e(ː)/, and for the fricative /s/, which may have had a more retracted value than English /s/. --Urszag (talk) 06:13, 6 October 2019 (UTC)Reply
@Urszag: Yes, the [l] is intentional (at least halfway) because after consulting the wiki page for the lateral approximant (specifically the remarks on Turkish which seems to have a very similar distribution) I didn't want to mark the palatalised /l/ as dental, and I'm fine with leaving the "intermediate" one, well, intermediate - I think it fits well. I believe the evidence for dentally-articulated plosives is overwhelming and I don't remember reading anyone expressing doubts about that, and in these works /n/ is generally grouped there as well, although it's rather found with /l/ and /r/ as alveolar in most of Romance. The grammarians describe all three liquids as articulated near the teeth - though the descriptions of /r/ are more vague and we need to keep in mind the rhotacism, which quite clearly points to alveolar articulation for both /r/ and /s/. For the latter, it's likely that it was at least laminal alveolar if not retracted, and it still is in large areas of Romance, but it's notoriously dental in Central Italian dialects, while Etruscan, which clearly exerted a large influence on Latin in pre-literary times, had a contrast between a palatal /ʃ/ and what is presumably the very same dental sibilant still found in those areas today. I'd love to read some studies on this topic, but I haven't come across any so far. The grammarians describe the /s/ likewise as produced around the teeth: Terentianus Maurus & Marius Victōrīnus say "pōne dentēs", Mārtiānus Capella "S sibilum facit dentibus verberatis". About the velars, I don't personally see the need because such a pronunciation before front vowels is all but conditioned by the human physiology. Is there a language where they're specifically velar? That said, I remember there being indications of an assibilated pronunciation (alliteration with /s/) as early as the 3rd century in an African poet.
I personally think velars must have been palatalised already in CL: there are indications of this in the lack of velarisation of e->o / _lC, e.g. in celsus (cf. pulsus).
As for Ecclesiastical and Vulgar (ugh), I simply couldn't figure out which rules are taken from CL and which aren't. If you know how to, then by all means! And yeah, it seems logical that it should correspond to the pronunciation of the corresponding phonemes in Standard Italian. Brutal Russian (talk) 01:22, 7 October 2019 (UTC)Reply
@Brutal Russian:What is the overwhelming evidence for dentally articulated plosives? It seems like Occam's razor supports it, given the usual articulation of the Romance reflexes, but a shift between the alveolar and dental positions doesn't seem extremely improbable: how can we definitively exclude that possibility? I think that the alveolar ridge and the teeth are close enough that we wouldn't really be able to figure it out for sure just from native speakers' descriptions: I'm an English speaker and I can feel my tongue tip touch the base of my upper teeth when I say /t/ or /d/. And you said that the grammarians also mention the teeth in connection to /s/ and /r/, even though it seems probable that those were in fact alveolar rather than purely dental. I tried to look up more information about this on Google, but I didn't have much luck. Just because I have it available, I looked at Vox Latina (ed. 2) for Allen's description of this topic: he says that Latin /t/ is "sometimes said" to be dental rather than alveolar, but Allen interprets Terentianus Maurus, K. vi, 331 as describing an alveolar articulation for /t/ and a dental articulation for /d/. Allen doesn't express actual trust in the existence of that differentiation between the place of articulation of /t/ and /d/, but he uses that quotation (which I haven't tried to analyze yet) as a way to dismiss those who would "insist on suppressing English speech-habits in this particular connexion" (p. 13-14). I feel like Allen often argues in a way that is kind of biased towards pronouncing Latin in a way that is easy for English speakers (another area where I think he does this is aspiration of voiceless stops), but it doesn't sound like he was aware of any major evidence about this topic. 
I think the [t̪] [d̪] transcriptions are probably more accurate, but as I mentioned above, I feel like it might be a little inconsistent with the general state of our knowledge about Latin phonetics to use a narrow transcription in this one area (the coronal plosives), when we are vague (probably justifiably so) in some other areas. The symbols [t̪] [d̪] are obviously indispensable in transcriptions of languages like Sanskrit or Hindi that distinguish these plosives from other coronal plosives that have a different place of articulation, but in languages without a contrast, it's not uncommon to just use [t] [d] as a less precise representation of the sounds. The main arguments that I can think of for us to use [t̪] [d̪] are as follows. A), it reminds native speakers of languages with a single series of coronal plosives that are not normally dental (e.g. English) to dentalize their plosives when pronouncing Latin. That might be useful, but it's not really possible for our transcriptions to include reminders for everything that an English speaker might have a tendency to get wrong (e.g. fronting high back vowels, diphthongizing non-low long vowels, reducing unstressed vowels to schwa). B), it reminds native speakers of languages with multiple series of coronal plosives to use dental plosives in Latin. I don't know how many of those actually consult English Wiktionary for Latin pronunciation, but aside from that, that reason seems valid to me. But if B) is a good reason, I think there might also be a good reason to use [c] and [ɟ] (I agree with you about the palatal value; when I said "pre-velar" I was kind of hedging to be closer to the current transcriptions), since there are languages (e.g. Icelandic and Greek) that have multiple series of dorsal plosives. Sure, it's not common for [ki] to exist in contrast with [ci], but that doesn't necessarily prevent the use of transcriptions like [ci] and [ɟi]. 
Icelandic and Greek both neutralize the contrast in favor of palatals before front vowels, but I think it's usual in transcriptions of these languages to write [ci] rather than [ki]. It could be argued that these languages are transcribed this way for phonemic reasons (because [k] and [c] can contrast elsewhere), but I've also seen [ci] used to transcribe French qui, even though French clearly only has one phonemic dorsal plosive series.--Urszag (talk) 00:22, 10 October 2019 (UTC)Reply

The module should not generate pronunciations with singleton intervocalic /z/ in word-internal position[edit]

For some reason, the module currently doesn't seem to properly show the gemination of word-internal intervocalic /z/ that was a regular feature of Latin pronunciation (taken from the metrical behavior of Greek ζ, the source of the Latin /z/ sound). For example, the entry for Maziris displays the pronunciation as /ˈma.zi.ris/, [ˈma.zɪ.rɪs]; it ought to be /ˈmaz.zi.ris/, [ˈmaz.zɪ.rɪs]. Although the length of the /z/ is predictable, it should be transcribed as double [z.z] in the phonemic transcription because it is not just a phonetic implementation detail: the first syllable of Maziris would scan long in Latin verse, acting as a closed syllable on the level of the phonology. Another entry with an incorrect pronunciation is horizon; we currently show /ˈho.ri.zoːn/, [ˈhɔ.rɪ.zoːn], but it ought to be /hoˈriz.zoːn/, [hɔˈrɪz.zoːn]--Urszag (talk) 04:47, 29 September 2019 (UTC)Reply

I went ahead and added a rule that doubles /z/ between vowels. This seems sufficient to get the desired behavior for Classical Latin. I think a bit more work is needed to get the optimal display of the Ecclesiastical Latin pronunciation. I think the best thing would be to transcribe all inherent geminates in Ecclesiastical Latin as geminate at both the phonological and the phonetic level. This is what seems to be done for Italian, the language that Ecclesiastical pronunciation is based on: our entries for degno and mezzo use the transcriptions /ˈdeɲ.ɲo/ and /ˈmɛd.d͡zo/, not /ˈde.ɲo/ and /ˈmɛ.d͡zo/. However, we currently transcribe the first half of geminate affricates with the symbols for plosives, which I think makes it preferable to use /d͡z/ in all contexts for the Z phoneme (the transcription /d.z/ for long [d͡z] looks weirder to me than /d.d͡z/). I tried to edit the module to do that, but it broke the phonetic code, so I changed it back. I'd appreciate it if someone would look it over and figure out what went wrong.--Urszag (talk) 06:02, 4 October 2019 (UTC)Reply
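For readers following along, the doubling rule described above could be sketched roughly like this. It is a Python illustration only, not the module's actual Lua code: the function name, the vowel inventory, and the assumption that the input is an unsyllabified phonemic string are all hypothetical.

```python
import re

# Hypothetical vowel inventory for the illustration. "ː" is included on the
# left-hand side so that a long vowel also counts as a preceding vowel.
_VOWELS = "aeiouy"
_INTERVOCALIC_Z = re.compile(rf"(?<=[{_VOWELS}ː])z(?=[{_VOWELS}])")


def geminate_intervocalic_z(phonemes: str) -> str:
    """Double a single /z/ that stands between two vowels.

    Word-initial /z/ and an already double /zz/ are left untouched, because
    neither has a vowel on both sides of a single z.
    """
    return _INTERVOCALIC_Z.sub("zz", phonemes)


print(geminate_intervocalic_z("maziris"))   # input for Maziris before syllabification
print(geminate_intervocalic_z("horizoːn"))  # input for horizon before syllabification
```

Running the rule before syllabification means the syllabifier then sees an ordinary geminate and closes the preceding syllable, which is the behavior the metrical evidence calls for.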

Syllabification and pronunciation of H/aspiration[edit]

(Notifying Fay Freak, Benwing2, JohnC5, Lambiam): I'm not sure what I think is the best practice in this area, but I want to bring up the topic because I feel that the current treatment could perhaps be improved. Right now, we treat all of the following as consonant phonemes: /pʰ/ /tʰ/ /kʰ/ /h/. It's clear that /pʰ/ /tʰ/ /kʰ/ functioned as onsets, but they could be interpreted as sequences starting with /p/ /t/ /k/ (/ph/ /th/ /kh/) instead of as unitary phonemes. I don't know whether it makes much difference which analysis we go with. The situation with /h/ is more problematic, though: we transcribe it as a syllable onset, but it did not in fact behave as an onset for the purposes of Classical Latin prosody. For example, adhibeō is currently transcribed /adˈhi.be.oː/, [ad̪ˈhɪ.be.oː], which suggests that the first syllable is closed and so should scan long: but in fact, syllables with a short vowel followed by a consonant plus H regularly scan short in Latin poetry, implying that they were open, and that the consonant "before the H" functioned as the onset of the following syllable, not as the coda of the preceding syllable. I realize that, given the apparent role of h-loss as a social marker in classical Latin pronunciation, we might not want to transcribe pronunciations with loss of aspiration like /aˈdi.be.oː/ for this period (even though pronunciations without /h/ were most probably widespread). But transcriptions like /adˈhi.be.oː/ don't seem adequate to me. While it's not impossible that the syllabification used in Latin poetry was different from that used in prose, I know of no evidence that would support ever syllabifying /dh/, etc. in Latin as /d.h/, so I think we should transcribe the /d/ as an onset in accordance with the evidence of poetry. I would tentatively suggest using /aˈdhi.be.oː/, with an onset /d/ followed by the aspiration marker /h/, as an artificial representation of the phonological structure of such words (the same for bh, nh, rh, etc). 
But I have no idea what to give for the phonetic form in these cases. @Brutal Russian, I see that you've represented outright loss of aspiration in your transcription of abhinc as /aˈbink/, [aˈbɪŋk]; do you know of any source that could be cited to support this as a representation of the educated or "prescriptive ideal" Latin speech of the classical era?--Urszag (talk) 13:05, 6 October 2019 (UTC)Reply
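The prosodic point can be made concrete with a small sketch. The Python below is only an illustration of the reasoning, under loud assumptions: the function name is invented, each letter is treated as one consonant (so x and qu are not handled), and muta cum liquida is ignored entirely. The single idea it encodes is that H does not count toward closing the preceding syllable.

```python
def scans_long(vowel_is_long: bool, following: str) -> bool:
    """Rough scansion check for one syllable.

    `following` holds the consonant letters between this syllable's vowel
    and the next vowel. Simplifying assumptions (not from any source code):
    one letter = one consonant, and muta cum liquida is not modelled.
    """
    if vowel_is_long:
        return True
    # H is treated as aspiration, not as a consonant that "makes position".
    consonants = [c for c in following if c != "h"]
    # Two or more real consonants: the first one closes the syllable.
    return len(consonants) >= 2


print(scans_long(False, "dh"))  # first syllable of adhibeō: d is an onset
print(scans_long(False, "df"))  # first syllable of adferō: d closes the syllable
```

Under this treatment the first syllable of adhibeō comes out open and short, while a transcription such as /adˈhi.be.oː/ would wrongly predict a closed, long syllable.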

@Urszag: The fact is I don't know of any indication that would lead one to believe that Latin had aspiration after any consonant other than p/t/k, and it had it after these for the simple reason that they were lifted from Greek. The grammarians do discuss some words that some apparently pronounced with aspiration intervocalically, and even a couple that we're led to believe everyone did (Terentius Scaurus in the time of Hadrian seems to say that about vehō). I don't know if this is a genuine preservation, but it seems more likely to be by analogy with its other velar stems. Here's from Lindsay 1894: "Velius Longus (second cent.), vii. 68. 15 K. gives vemens and reprendo as the usage of the 'elegantiores,' prendo as universal, and Annaeus Cornutus (first cent.), the friend of Persius, who mentions prendo, vemens, nil as the pronunciation of his day". Quintilian says: "The letter h [...] was added by our forefathers to give strength and vigour to the pronunciation of many words [...] done from their devotion to the Attic language, and under its influence. [...] In the same way our ancestors said lachrumae, sepulchrum, ahenum, vehemens, incohare, helluari, hallucinari, honera, honustum. For in all these words there seems to be no reason for that letter, or breathing, except to increase the force and vigour of the sound by adding certain sinews, so to speak." In this light it seems very doubtful to me to have the intervocalic /h/ as a standard phoneme at all - what it seems to have been instead is a social marker. Indeed, the same Scaurus censures the aspiration in pulcher on the grounds "ne una omnino dictio adversus latini sermonis naturam media aspiretur" - this grammarian identifies exactly one Latin word with an aspirated consonant as a standard pronunciation... although media ought to be a voiced consonant. Could he have meant that the /r/ in pulchrum is likewise aspirated?
However widespread the aspiration fad might have been around Cicero's time, there definitely are no words that I know where a written H would be mistakenly added to any consonant other than p/t/k, except perhaps for anhēlāre by false analogy with hālāre. Catullus' oft-cited 84 is likewise about mis-aspiration after voiceless stops or word-initially. I also remember reading in some grammarian that "no voiced consonant is aspirated in Latin", but googling "nulla media aspiratur" didn't produce fruit - it might have been the Scaurus quote, tbh. One can open any historical grammar, from Lindsay 1894 to Sihler 1995 to Weiss 2009 to anything in-between to find plenty of unambiguous evidence that there was no /h/ in words like adhibēre, diribēre, dehinc any more than in nēmō, and that for instance vehemens owes it to vehere while in ahēnum and Ahala it's straight-up orthographic. The only place /h/ was anything like phonemic in CL was word-initially, because in rural Latin it wasn't and rural Latin was the opposite of cool ("urbānum"). I think the burden of proof is on the party that believes otherwise. Brutal Russian (talk) 02:19, 7 October 2019 (UTC)Reply
@Brutal Russian: What did you mean by this part: "there definitely are no words that I know where a written H would be mistakenly added to any consonant other than p/t/k"? When I mentioned bh, nh, rh, I was thinking about words formed from adding a prefix like sub-, in- or per- to a word starting with the letter H. It certainly seems plausible to me that such words typically lacked aspiration in classical times, but the issue I'm worried about is that I don't know of any popular guide to "Classical Latin pronunciation" that explicitly gives that as a rule, rather than just vaguely referencing the instability of the sound and the variability in its distribution. For example, W. Sidney Allen's well-known Vox Latina discusses some of the evidence about the early loss of /h/ in various contexts in Latin, but then finishes up by giving a rule that actually contradicts transcriptions like [aˈbɪŋk] or [aˈdɪ.be.oː]: "The only safe rule for the English reader is to pronounce Latin h as such [i.e. as the English H sound] wherever he finds it in his modern texts (except in humerus, humor, humidus, ahenus, where it is certainly out of place). He will thereby be following, with perhaps even greater consistency than the native speaker, the habits of at least the most literate levels of classical Roman society" (p. 45, 2nd ed.). I don't take Vox Latina as my Bible or anything like that, but I'd imagine that many English-speaking students of Latin will have learned a spelling-based rule like the one that Allen gives—roughly "The most 'correct' pronunciation in classical times was to always pronounce Latin h where written"—and I was wondering how we could explain to such users why our transcriptions don't follow that rule.--Urszag (talk) 03:55, 7 October 2019 (UTC)Reply
Update: I found sources that say that in word-medial position, /h/ was lost "in many words" (Stuart-Smith 2004, page 48; the wording "many" implies that it could have been present in some words) or that it is "doubtful that it was pronounced" (Wallace, page 325). However, these sources aren't very detailed. I'm not fully convinced that the evidence strongly supports a position of saying that /h/ was pronounced word-initially in words spelled with <h> by some Latin speakers in the Classical era, but never pronounced in word-medial position in words spelled with <h>. I haven't seen any source so far that gives a rule restricting the pronunciation of word-initial /h/ based on the identity of the preceding segment; without such a rule, sequences like /bh/ and /dh/ would still arise in phrases like "ab homine" and "ad hostis". If some speakers pronounced /h/ (somehow) in these contexts, it seems like it would not be too challenging for them to use the same pronunciation in words with prepositional prefixes, or in univerbations like "abhinc".--Urszag (talk) 03:55, 10 October 2019 (UTC)Reply
@Urszag: >What did you mean.. I meant the fad of (mis)aspirating that Catullus is describing in the 84th poem, and the mentions and evidence of which you can find all over the literature. This phenomenon never occurs with consonants other than p/t/k. This is only explainable if one assumes that aspiration was impossible with other consonants in Attic and consequently in Latin. Our knowledge of the presence of aspiration is derived from spelling mistakes and direct mentions as well as from our knowledge of Greek - and as I say, I know of no such evidence as regards aspiration occurring after other consonants. The Scaurus quote above is a direct refutation of there being aspirated voiced segments in Latin.
>..rather than just vaguely referencing the instability of the sound and the variability in its distribution Yeah, I don't know of such a guide either. Allen's rule is clearly a simplification for the sake of practicality. As you correctly point out, his is less of a faithful historical reconstruction and more of a practical pronunciation guide for English speakers, and is written as such. "Where it's written" is an extremely shaky rule in principle because the Latin spelling as found in most editions is based largely on that of Carolingian manuscripts and at best gives us a glimpse at post-Augustan grammarian-standardised spelling conventions. The problem of explaining to the users the rules that our transcriptions follow, and their rationales, is a good one to consider. Someone did propose a page with summaries for that purpose, although I wonder if Wikipedia's page isn't a better place for that.
>..never pronounced in word-medial position in words spelled with <h> That's not my position: "The grammarians do discuss some words that some apparently pronounced with aspiration intervocalically, and even a couple that we're led to believe everyone did". One could theoretically compile a list of particular words to whose pronunciation we have references.
>..it seems like it would not be too challenging for them to use the same pronunciation in words with prepositional prefixes, or in univerbations like "abhinc" If what occupied the onset e.g. in /adhūc/ was /d/, then there was either no /h/, or there was an aspirated /dh/ in the language, or /h/ behaved as a liquid after stops. If there were aspirated voiced stops, there would have been reverse misspellings, e.g. dux > *dhux like adhūc~adūc parallel to corōna > chorōna like chorus~corus. If /h/ was able to occupy syllable onset after a voiced consonant like a liquid (even if optionally), it would have resulted in a possibility of heterosyllabification, as with /r/ and /l/ - and note that here the heterosyllabification is mandatory in prefixes: /ob.rū.tus/. What's left is to follow Greek and Roman grammarians in treating it as a feature of the preceding consonant or the following vowel and not a phonemic segment of its own. If one postulates that this is just a metrical convention, then one might just as well throw the entirety of metrical evidence out the window as not representing any speech facts and easily dismissible (a position that I don't believe is tenable). The apparent fact that it was retained only word-initially, and marginally medially, finds a precise parallel in the Greek digamma. While H being mute in the contexts in which a casual learner might not see any reason it should be is perfectly exemplified by English and Old French. In normal Latin speech it's quite evident that it didn't surface if the onset was already taken, although I can imagine a forced-learned pronunciation where a junction of a voiceless stop and /h/ was realised as an aspirated stop.
There's simply no reason to believe that the -h- in compounds was anything more than an etymological connecting/disambiguating mark - sometimes correctly, e.g. adhibēre to habēre, adhūc to hūc, sometimes not, e.g. anhēlāre to hālāre (itself with a non-etymological onomatopoetic H), and sometimes not connecting at all, e.g. diribēre < habēre and nēmō < homō. And at times a hiatus, or even a long-vowel mark or a stylisation. If we have concrete evidence of its loss in such compounds (rhotacism in diribēre, prevocalic shortening in prohibēre, together with synizesis in dehinc etc.), we must presume it stayed lost unless given evidence of its restoration, which, again, I don't believe we have. If we have similar evidence of its loss at the junction with a preceding vocalic segment outside of compounds (elision), then we must assume it behaved identically regardless of how close the syntactic connection was, thus ab homine like ab illō homine like adhibēre like dēbēre - no /h/ in any of those. Thus saying /adhibēre/ or /adhominem/ would be no different from saying /nehemō/ - and again, current spelling is not a reflection of the Classical one and the Romans were aware of the latter one's etymology. Did some overzealous grammarian pronounce both like that? It seems entirely probable to me, one comes across similar anecdotes now and again. But listing this as a standard pronunciation is not consistent with all the evidence I'm aware of. Brutal Russian (talk) 00:33, 12 October 2019 (UTC)Reply
There is currently a page Appendix:Latin pronunciation where I think a note about the especially precarious status of /h/ in Latin phonology would not be out of place. I may work on adding that. I don't think we disagree on much; I've just been trying to figure out how specific we can be about this topic. Regarding your point about "current spelling is not a reflection of the Classical one", does that actually apply to the specific word nemo? I've never seen it spelled with the letter "h", and I do think spelling may be important in this area, since the use of /h/ seems to have been a prescriptive ideal rather than something that was acquired as a matter of oral usage.--Urszag (talk) 20:20, 12 October 2019 (UTC)Reply
No, *nehemo is not an attested spelling. Spelling probably was important in rare words that the reader didn't normally encounter in speech, but it doesn't seem likely that it played much of a role in the basic vocabulary. A cockney speaker who pronounces the H in hour won't sound any better than one who only pronounces it in eight, and the fact that it's spelled in one and not the other won't help him. At the same time, while pronouncing it in vehicle would sound just as bad in British English, some speakers of AE do pronounce it like that on forvo. And here's an Austrian speaker even pronouncing the H in stehen - my only explanation for this is that standard German is foreign to her. Pre-modern French took this to an extreme with its H muet/aspiré - predictable more or less on etymological basis, but even then with exceptions, as I remember reading. As for prescriptive ideal, during classical times it was the aspirated pronunciation that was in vogue, while spelling followed only when the writer felt the pronunciation was established enough (pulcher) or supported by enough etymology (vehemens) - it's just that we have to start with the assumption that non-initial H had become silent pre-classically, and was restored selectively, haphazardly and most importantly, non-phonemically, as far as I can see - can anyone suggest minimal pairs with a non-initial H or even an aspirated stop? Brutal Russian (talk) 08:00, 14 October 2019 (UTC)Reply

Open and close e and o in Ecclesiastical[edit]

@Urszag: Is there some source for Ecclesiastical Latin using open and close e and o? At one point when I was editing the IPA help page on Wikipedia, I didn't find anything describing such a distinction, though I recall reading that Italian speakers sometimes make the distinction by analogy with Italian words. It would be strange if such a distinction were prescribed, because it's not marked. At least, aside from ae vs. oe I guess, but those have often been confused. — Eru·tuon 02:56, 7 October 2019 (UTC)Reply

@Erutuon: Oh good, I was just planning to start a discussion on this! There is no reliable source as far as I know that says that Ecclesiastical Latin makes the distinction, but it somehow became established in the Wiktionary project, and if the symbols /ɛ/ /ɔ/ and [ɛ] [ɔ] are used anywhere in our transcriptions of Ecclesiastical Latin, they certainly should be used in stressed syllables that had short vowels or ae in Classical Latin. The prior discussion is here: Template_talk:la-IPA#Ecclesiastical_fixes; you can see that there is a lot of confusion about the use of open and close E and O in "Ecclesiastical Latin", just as there is for the use of the qualities in Italian (I feel like I have read things from many Italian speakers who say that their usage contradicts the supposedly standard usage, or who say that it's pointless for a learner to try to distinguish the two qualities because different regions use them in different words). I'd imagine that later pedagogical efforts to reintroduce the Classical length distinction have jumbled things further (someone on that page suggests that German traditionally followed the Classical lengths, which is wrong, but which may be true for some innovative hybrid German pronunciation systems). The best referenced usage that I know of is to have no phonemic distinction (which would make the transcriptions /e/ /o/ appropriate), but to have a more open rather than a closer quality, which would make [ɛ] [ɔ] better than [e] [o] for the phonetic transcription. That's what Harold Copeman says, as mentioned in the linked thread, although I don't know how accurate it is for modern usage (Copeman also gives the "traditional" ecclesiastical pronunciation of xc before e or i as /kʃ/, but that seems to have completely fallen out of use by now in favor of /kstʃ/).
If everyone currently active on this project agrees, I would be happy to see the current Ecclesiastical Latin transcription replaced with one that only distinguishes five vowel qualities.--Urszag (talk) 03:16, 7 October 2019 (UTC)Reply
Oh yeah, I forgot about that other discussion. That must be where I heard about Italian Ecclesiastical Latin speakers making the distinction. — Eru·tuon 03:22, 7 October 2019 (UTC)Reply

Is the module supposed to leave out the phonetic transcription when it happens to be the same as the phonemic transcription?[edit]

(Notifying Erutuon, Benwing2): I was confused about why, after I edited to make geminates double in phonological transcriptions, the module stopped showing phonetic transcriptions (in square brackets) on pages like benignus. After comparing it to biennis and annullo, it looks like the module currently doesn't show a phonetic transcription when it would use the same sequence of characters as the phonemic transcription. This is not something that I expected at all; it's a pretty rare situation (it can only occur when the stressed syllable is closed), so I hadn't noticed it as a pattern before, and it just looks like the phonetic transcription is missing. I would prefer for both transcriptions to be shown in this circumstance. Have I accurately described the current behavior? Is it intentional, and do other people agree with my proposal to change it?--Urszag (talk) 07:41, 7 October 2019 (UTC)Reply

That's the intended behavior, as mentioned in #final nasalization again above. I've also found it confusing at various times, so maybe it should be changed.... — Eru·tuon 09:04, 7 October 2019 (UTC)Reply
A few moments laterrr... it's done - granted, with all the recent changes I couldn't even find a word that actually needed this. Brutal Russian (talk) 16:39, 15 April 2021 (UTC)Reply
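For clarity, the display behavior discussed in this thread amounts to something like the following. This is a Python sketch rather than the module's Lua; the function name and the `hide_identical` flag are invented for illustration, and "ˈab.ba" is a made-up placeholder string, not a real entry's transcription.

```python
def format_transcriptions(phonemic: str, phonetic: str,
                          hide_identical: bool = True) -> str:
    """Join the phonemic and phonetic transcriptions for display.

    With hide_identical=True (the behavior originally described in this
    thread), the bracketed phonetic form is dropped when it is
    character-for-character identical to the phonemic one.
    """
    shown = [f"/{phonemic}/"]
    if not (hide_identical and phonetic == phonemic):
        shown.append(f"[{phonetic}]")
    return ", ".join(shown)


# The two forms differ, so both are shown either way.
print(format_transcriptions("ˈmaz.zi.ris", "ˈmaz.zɪ.rɪs"))
# Identical placeholder forms: the phonetic one is hidden by default...
print(format_transcriptions("ˈab.ba", "ˈab.ba"))
# ...and shown when the flag is turned off, as proposed above.
print(format_transcriptions("ˈab.ba", "ˈab.ba", hide_identical=False))
```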

If I want to add more test cases, is there any way to avoid crowding the page?[edit]

I was thinking of working on the Vulgar Latin code to try to fix various parts of the current behavior that I think are errors (for example, the use of [β] rather than [b] in the current transcription of *septembrius). But I've already added some words there, and I don't know whether adding even more would make the page excessively long. Is there a way of splitting up the test cases into multiple subpages, and if so, would it be a good idea for me to do that for Vulgar Latin testcases?--Urszag (talk) 09:18, 11 October 2019 (UTC)Reply

I found a few modules with multiple testcases pages: Module:hu-pron has Module:hu-pron/testcases and Module:hu-pron/testcases2, and Module:he-translit has Module:he-translit/testcases and Module:he-translit/testcases/special. So those are possible naming schemes. — Eru·tuon 18:22, 12 October 2019 (UTC)Reply
In general I think the more testcases the better. At the moment there are probably lots of things that aren't being tested for yet, but should. — Eru·tuon 18:31, 12 October 2019 (UTC)Reply
I see, thanks! Hmm, compared to that Hungarian page, I guess the page for Latin is actually quite short.--Urszag (talk) 20:04, 12 October 2019 (UTC)Reply

What exactly should the "Vulgar Latin" transcriptions represent?[edit]

I've been trying to look over the "Vulgar Latin" parts of the module, but there are a lot of difficulties that are starting to make me think that it might be a good idea to just abandon the goal of producing automated Vulgar Latin transcriptions. It may be better for each entry for a Vulgar Latin term to get specific attention paid to it. The biggest issue in my view is which modern languages we use as the basis for the reconstruction. The current transcriptions seem to be based in several respects on Western Romance specifically, not all Romance languages. For example, singleton voiceless stops did not become voiced in general south of the La Spezia–Rimini Line. The merger of short i and long e, and short u and long o is also a Western and Italian Romance feature, not a feature of all Romance languages: Sardinian is well-known for merging short i and u into long i and u instead, which implies that at the time Sardinian split from the other Romance languages, Vulgar Latin vowels had not yet merged into a seven-vowel qualitative system (at least not for all speakers). The only developments from the Classical Latin phonological system that are definitely shared between all Romance languages as far as I know are: loss of h, merger of short and long a, loss of word-final "m" in words of more than one syllable, intervocalic lenition of b to v (but I think exceptions might exist for this one), and development of unstressed /e/ and /i/ to /j/ before a vowel. The resulting /tj/ and /kj/ clusters seem to have affricated early enough to be in many languages, even languages like Sardinian that don't show palatalization in general for /k/ before syllabic front vowels, but I'm not sure whether there is any adequate transcription for the resulting affricates, since the Romance reflexes differ.--Urszag (talk) 02:03, 13 October 2019 (UTC)Reply
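To make the vowel-merger point concrete, here is a small Python sketch of the two stressed-vowel outcomes mentioned above. The table names and the helper function are invented for illustration, and the mappings are the textbook simplification (stressed vowels only, diphthongs and later regional changes ignored), not anything taken from the module.

```python
# Western / Italian Romance: short i merges with long ē, short u with long ō,
# giving seven vowel qualities.
WESTERN_ITALIAN = {
    "ī": "i", "i": "e", "ē": "e", "e": "ɛ",
    "ā": "a", "a": "a",
    "o": "ɔ", "ō": "o", "u": "o", "ū": "u",
}

# Sardinian: each short vowel merges with its own long counterpart,
# giving five vowel qualities.
SARDINIAN = {
    "ī": "i", "i": "i", "ē": "e", "e": "e",
    "ā": "a", "a": "a",
    "o": "o", "ō": "o", "u": "u", "ū": "u",
}


def stressed_vowel_outcome(classical_vowel: str, system: dict) -> str:
    """Look up the reflex of one Classical stressed vowel in a given system."""
    return system[classical_vowel]


for name, system in (("Western/Italian", WESTERN_ITALIAN), ("Sardinian", SARDINIAN)):
    print(name, "short i ->", stressed_vowel_outcome("i", system),
          "| qualities:", len(set(system.values())))
```

Because the two tables disagree on short i and short u, no single automated "Vulgar Latin" transcription can be correct for both branches, which is the difficulty raised in the comment above.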

There can be no one Vulgar Latin transcription because "Vulgar Latin" is not a reconstruction of any single linguistic system. I'm very confused about Wiktionary's treatment of the whole phenomenon - it looks rather like someone's pet conlang project. Brutal Russian (talk) 05:39, 14 October 2019 (UTC)Reply
  • Indeed, quite confusing, even though very useful despite that IMO (and BTW, we could add more dialectal variants, like Afro-Sardinian, Proto-Balkan-Romance, Proto-Gallo-Romance, Proto-Ibero-Romance etc., somewhere in the future). I propose renaming the "Vulgar Latin" transcription to "Proto-Northwest-Romance" (as per Prof. J. B. de Carvalho) or "Common Romance, NW dialects", or any other unambiguous designation. Ain92 (talk) 16:49, 15 May 2020 (UTC)Reply
  • @Brutal Russian What measures do you propose? Ain92 (talk) 16:49, 15 May 2020 (UTC)Reply
@Ain92 Well, I share a few thoughts on this here - basically since "Vulgar Latin" seems to be largely based on Pompeiian evidence, and since Pompeiian Latin is a variety that's actually attested at least in writing, I think it wouldn't be a terrible idea to extend default pronunciations with it in addition to current Classical and Ecclesiastical. Granted it will have to be closer to Classical than what we currently have as Vulgar (e.g. contrastive vowel length was at least as prominent there as in modern Lithuanian or BCS, which mark it). Another variety I'd like to see is circa-Plautine Republican Latin with EI-I and OU-U distinctions (it would be a fun project to find all the forms with original EI and OU), facilius-type words stressed on the first syllable, EQVOS [ˈɛ.kʷɔs], a bold application of the sonus medius and other Republican things that the extant tradition of Classical texts doesn't faithfully reflect. Brutal Russian (talk) 22:00, 15 May 2020 (UTC)Reply
  1. I weakly support this idea of adding Pompeiian by default, seems to make sense.
  2. Circa-Plautine Republican Latin is tentatively not considered Latin but rather Old Latin for by our current consensus as far as I understand it, so unless we don't consider changing it soon (this is of course a matter of convention but a separate discussion is needed to redelineate Latin and Old Latin and/or redesignate the former as Old Latino-Faliscan) you could create a separate Template:itc-ola-IPA anytime whenever you want. Ain92 (talk) 16:46, 16 May 2020 (UTC)Reply
  3. Regarding the @Urszag's points from March, I fully agree with the first argument about incorrect implementation, but the second argument, as @Tom 144 pointed out, actually suggests renaming Vulgar Latin to some less ambiguous name (just as I wrote below yesterday without seeing that discussion).
  4. And regarding your opinion there, I agree that we don't always need a narrow phonetic transcription for the early Romance varieties; rather, we could sometimes limit ourselves to less controversial phonemic renderings, e.g. as used in the ongoing Dictionnaire Étymologique Roman project. BTW, that's an option I would like to have for all our "Vulgar Latin" reconstructions, possibly even right in the headword line.
  5. I also wonder what you meant by "proto-Florentine". What period is it supposed to represent? The absolute chronology is of course a matter of intense debate, but is it Imperial Latin, Late Latin or Medieval? I don't think we need any Romance phonological reconstructions after the end of our Late Latin (8th century).
Ain92 (talk) 17:08, 16 May 2020 (UTC)Reply
  • On Old Latin, may it R.I.P. Plautus is Old Latin like Victorian English is Old English.
  • On renaming Vulgar Latin and the next point, it should simply dissolve into the reconstructed-namespace Proto-Romance and for attested items, into Latin with stylistic labels as needed to reflect the word's period and non-standard, dialectal, regional etc character.
  • Proto-Florentine is the pre-literary stage of the Romance dialect, so unattested and belongs to Proto-Romance. Everything that reconstructs forms based on comparative Romance evidence should belong to the asterisked Proto-Romance. Late Latin is the attested Latin of some period or other - I don't know the policy here but my assumption was between 3rd and 6th c. AD, as in up to Isidore, which is when the Middle Ages begin in earnest and Latin gets well and truly fun :-))
  • While ongoing etymological dictionaries argue about the merits and drawbacks of the comparative method and how much attested Latin needs to influence the reconstructions this method results in, the optimal course of action for us, as I see it, is to adopt both approaches separately and simply cross-link the attested (Latin) and the reconstructed (Proto-Romance) entries, and so not have to speculate on the relationship between them or on any interim forms.
  • Relevant reading: Dworkin (2016). Do Romanists Need to Reconstruct Proto-Romance? Brutal Russian (talk) 18:31, 16 May 2020 (UTC)Reply
    • Well, I missed that discussion! I disagree with the analogy though: Victorian English postdates orthographic standardization, while Plautus predates it by some three centuries. A more correct analogy would be an author from the times of Shakespeare: if one normalizes the orthography, it's understandable to an unprepared contemporary English speaker who has never had any exposure to Early Modern English before, but if you ſhow ſame ſpeker yͭ uerie Texte in originall Orthographie yerof, yey wyl likelie refuſe to rede yᷤ Olde Englyſſhe. It's very possible (and I think, quite likely) that by the end of the next century, when the distance to EME grows further, linguists will treat Early and Late Modern English as two different and separate languages like Middle English is treated now.
    • Thanks, I've read that article, but still don't see what the need is to create an additional entity. Reconstruction articles in Latin look fine to me, especially considering Proto-Romance and Latin were two registers of the same language (like Old East Slavic is at the same time Proto-East-Slavic even despite all the difference between the high register of the chronicles and real spoken vernaculars). Ain92 (talk) 22:21, 16 May 2020 (UTC)Reply
  • P. S. While reading Wikipedia I ran into an even better example than my EME thought experiment above. "Lo, what should a man in these days now write: eggs or eyren? Certainly it is hard to please every man because of diversity and change of language." - thanks to modernized orthography, only "Lo" and "eyren" (a gloss!) give away that the text was actually written in not just Early Modern but Middle English, I even uploaded the original to Commons (Loo what ſholde a man in thyſe dayes now wryte egges or eyren/ certaynly it is harde to playſe euery man/ by cauſe of dyuerſite ⁊ chaũge of langage). That's exactly how the very real and different Old Latin (attested in epigraphy) composed by Plautus was transformed into what is extant; we only know of Classical Latin "descendants" litterarum Plautinarum! P. P. S. Also, after finding out about the recent separation of Proto-West Germanic from Proto-Germanic, I think it may actually be reasonable to add Proto-NW Romance as a separate language and bring all our Vulgar Latin reconstructions unattested in Sardinia and the Balkans there. Does it sound interesting to you, @Brutal Russian? Ain92 (talk) 00:16, 20 May 2020 (UTC)Reply
  • @Ain92 Lucretius, Cicero, Virgil also predate the standardisation of Latin or its orthography by the same or larger span of time than Plautus predates them, which is 150 years give or take - exactly the same timespan that separates us from late Victorian English, and the linguistic distance is about the same - including spelling. Plautus, Terence, Cato, Lucretius, Cicero, Virgil all used a language and orthographic variety commonly termed Republican Latin, which shows considerable variation compared to post-Augustan Imperial Latin. Your Early Modern English examples are more comparable to highly regional or even Archaic Latin as found in inscriptions and religious writings that were largely incomprehensible by the time of Cicero. Archaic Latin predates Latin literature and cannot be equated with the language of Plautus, which is studied together with that of Cicero as representing registers and phenomena that otherwise can hardly be studied on Golden Latin evidence (I don't understand this part: "we only know of Classical Latin "descendants" litterarum Plautinarum").
  • The suggestion is not to create any additional entities, but to rename Vulgar Latin, which currently combines properly Latin, sometimes not even substandard attested terms with things like *fabello - by the way is there any other attested language besides Latin that has a reconstructed namespace all of a sudden? I was hoping that the article I linked would demonstrate that the approach that makes the most sense is to treat reconstructions separately from attested Latin because the current research cannot decide on the relationship between the two, while also abusing the term "Vulgar Latin" to mean anything from "variation and change in Latin" to effectively a synonym of "proto-Romance". Up until recently the best attempts tried to speak of the same thing as "Vulgar Latin" when attested and "proto-Romance" when not, but now finally the recognition of the hopelessness of the former term is starting to set in (e.g. Kruschwitz, Halla-Aho 2007 as well as introductions to just about any post-2005 work on the topic).
  • When you say that "Proto-Romance and Latin were two registers of the same language", you're both equating it with Vulgar Latin and subscribing to a view explicitly rejected by the compilers of DERom, who don't take any attested Latin as evidence, and which you're certainly aware of after reading the article. I'm less concerned about your reasons for doing this and more concerned with the fact that you seem to want to have reconstructed Latin terms at the same time as being unwilling to create an additional entity and then suggesting to divide the entity we haven't created yet into a yet more narrowly defined entity. I'm afraid I'm completely lost as to the reasoning of this. Do you want to class everything that isn't NW PRom as Latin?...
  • Even if we wanted to subdivide proto-Romance, I don't think there's consensus on how to do that either. Hence my suggestion to make one reconstructed space termed proto-Romance and sort things out within its limits. I don't know what's going on in Germanistics and on here with regards to them, but I don't think there's a need to complicate things which are already as messed up as they are.
  • Here's a different random article just because I remembered: Coleman (1993).
Brutal Russian (talk) 02:08, 28 May 2020 (UTC)Reply
  1. Plautus and Virgil are separated by some 200 years, a timespan as large as that between William Caxton (the pic above) and Daniel Defoe (please check first editions of Robinson Crusoe yourself, it's very similar to contemporary English but not exactly the same). As I've already argued, it's wrong to cite Victorian English as an example since it postdates, not predates, the orthographic standardization which happened in English in the early 19th century: a proper analogy for Victorian English would be early Imperial Latin (and for our English of 2020 it would be Late Latin of the 3rd century). You seemingly haven't understood why I provided the EME examples: I argue that if you orthographically standardize a written text in an archaic or regional variant (chronolect or dialect resp.) of a language you know, be it Latin or English, the text becomes understandable while the original text was not, even despite the phonetic changes making the idiom almost incomprehensible for you. As a side note, litterarum Plautinarum was a failed attempt at code-switching to a language I don't know, because I wanted to combine "letters written by Plautus" and "literature composed by Plautus" in one word.
  2. Re: "by the way is there any other attested language besides Latin that has a reconstructed namespace" — yes, indeed, plenty of languages have both attested and reconstructed entries, that's absolutely normal: Proto-Norse, Old Norse, Old English, Old Frisian, Old Dutch, Old Saxon, OHG, Gothic, Old Armenian, Avestan, Sanskrit etc. Nothing but my laziness prevents me from creating Old East Slavic or Old Czech reconstructed entries.
  3. Will read both articles you linked on the weekend, and may comment on the issues of V.L. vs. P.-R. in more detail, but in brief, the fact I'm aware of DERom authors' reasoning and partially agree with it doesn't mean that I fully embrace it. They want to compile a strictly comparative dictionary, and I fully support such an effort, but despite that, I don't think we should exactly copy their methodology here in Wiktionary.
  4. If you don't like my Proto-NW Romance suggestion, it's tiresome for me to defend it in English and I'm abandoning it. If you would like to have some talk in Russian somewhere (preferably a real-time chat or even an online audio call), I would be glad to explain my reasoning behind it on the weekend. Ain92 (talk) 10:53, 28 May 2020 (UTC)Reply
  • @Ain92: Ok, so that comparison wasn't entirely in earnest, but I stand by it. Both Victorian and Modern English postdate standardisation; both Plautus' and Virgil's Latin predate it. The Early Modern English orthographical mess, on the other hand, is completely nuts and cannot earnestly be compared to the orthography of Latin even in the Middle Ages, or frankly of any other language I know - even Old French, since its spelling variation often reflects actual pronunciation differences. Even the most archaising inscription of Plautine times - the SCdB - was in all likelihood as perfectly legible to Virgil as it is to me. In fact all educated Romans of that period were used to orthographic variation. I compare this variation to the difference between how Victorian and Modern English are spelt. The same concerns phonetic changes as Republican orthography was phonetic as far as possible - Plautus would have had an obvious accent to Virgil's ears, but certainly with fewer differences than between Cockney and RP. Cicero says of this accent: "nōn mihi ōrātōrēs antīquōs, sed messōrēs vidētur imitārī". Here you can hear a very convincing reconstruction of how Naevius' epitaph sounded 200 years before and after Cicero. Whatever variety of Latin you have in mind that seems as illegible to you as that of Caxton, it certainly wasn't how people at Rome spelt in the time of Plautus, even granted we don't know exactly how Plautus himself spelt - assuming you're talking about your own experience in trying to read the Latin, which I'm not sure you are. So you must be thinking of 5c. BC Latin, something like the Duenos Inscription.
  • "we only know of Classical Latin "descendants" litterarum Plautinarum" - what I didn't understand is what it is that we only know of. Now I suspect that you believe that Plautus is only legible to me because his orthography has been modernised. Again, I assure you that this is not the reason. Besides, the current manuscript tradition seems to go back to 1c. BC, and this is the orthography you will find in the more faithful modern editions. So even if this wasn't how Plautus spelt, chances are good that the orthography of modern Plautine editions is exactly how Virgil, or someone he knew, spelt.
  • It's not that I'm trying to copy their methodology. It's that unless we want to take the stance that Romance languages descend directly from the linguistic entity that we call Latin here, drawing a clear distinction between reconstructed (PRmc) and attested (Latin) items seems to me to be the only way to dispense with the Vulgar Latin thing, which currently includes now real Latin, now putative reconstructions with no rhyme or reason.
  • Well, not that weekend I suppose, but you can find me at Discord: Unbrutal_Russian#0520.
Brutal Russian (talk) 09:09, 2 June 2020 (UTC)Reply
  1. Your arguments would have convinced me that (Old) Latin of c. 200 BC was mutually comprehensible with (Classical) Latin of c. 50 BC if only I had disagreed with this thesis in the first place. The comparison with Caxton was an exaggeration to illustrate my point (is that what you mean by "wasn't entirely in earnest"? 😉). Actually I have already written above that I hold the view that Plautus is to Latin what Shakespeare is to English, and Shakespeare in Original Pronunciation (a popular video, an entertaining news article) is perhaps more comprehensible to an American native speaker than e. g. South-West Irish English (a popular video), and indeed here in Wiktionary we consider both idioms flesh of the flesh of English.
  2. Nor am I going to challenge "that the orthography of modern Plautine editions is exactly how Virgil, or someone he knew, spelt", although I suspect that "exactly" is not an appropriate word here. And even though I disagree that the EME orthographical mess is unrivalled, that is out of scope of our current discussion. Now, in answer to "what is it that we only know of", the earliest extant manuscript of Plautus is the Ambrosian Palimpsest from 4th-5th c. AD.
  3. "So you must be thinking of 5c. BC Latin, something like the Duenos Inscription" - I agree that this is indeed a good analogy for (Late) Middle English, which is definitely a language separate from Modern English. Currently we have 𐌃𐌖𐌄𐌍𐌏𐌔 redirecting to duenos, which I dislike: I'd prefer this word being treated under a different L2 and in the original Old Italic orthography with an available romanization just like Gothic. However, to be fair I'm not very eager to do something about that so soon after the "demotion" of Old Latin. If someone reading these lines eventually happens to be interested, here are materials for a catalog of Archaic Latin inscriptions (unfortunately in Italian) and a related article (large PDF) in English.
  4. Regarding the articles about Vulgar Latin, I found those interesting, thanks, but I still concur with Cuzzolin and Haverling 2009: "However, despite these inconveniences, we find it difficult to do without such a well established term as “Vulgar Latin”." These two words are much more convenient than a lengthy formulation glossing the conventional broad umbrella term, and unless there emerges a widespread scholarly consensus that the term should be dumped, I'd prefer that Wiktionary not abandon it.
  5. As shown by examples in my previous post, here in Wiktionary we consistently conflate attested and reconstructed forms in a single language, so I do indeed "take the stance that Romance languages descend directly from the linguistic entity that we call Latin here" (to be more precise, a subset of Latin, which doesn't stop being Latin though). I think we could agree that by the time Balkan Romance split off, its last common ancestor with NW Romance was not yet far enough from Latin attested at the same time to be a separate abstand language, couldn't we?
  6. I had some trouble joining the en-wikt Discord server last year, but I may try it again soon. Yours sincerely, Ain92 (talk) 21:17, 3 June 2020 (UTC)Reply
  • I'm afraid you'll have to dumb down for me what point you were trying to illustrate, because I cannot connect it to my statement that calling Plautus Old Latin is like calling Victorian English Old English. Shakespeare is incomprehensible to unprepared modern English natives as well as to otherwise fluent second-language readers. Plautus on the other hand was as perfectly comprehensible to readers in the 1st cent. BC as the Victorian English of literary plays is to modern English readers. What does half-a-millennium-old English in whacky orthography, separated by standardisation, have to do with 150-year-old Latin separated not even by orthography? You can use Caxton to illustrate how the Duenos Inscription looked to Cicero, but certainly not how Plautus looked to him - and I'm pretty sure you simply confused Plautus with the language of that inscription.
  • Are you familiar with the orthography used in that manuscript? If yes, do you or do you not believe that the difference in orthography in that manuscript from how Plautus actually wrote is in any way comparable to the difference between Caxton's orthography and that of Modern English? If no, what's the point of comparing an unknown with a known?
  • The Middle English orthography of Chaucer that I've encountered is more similar to Modern English than your Caxton, which is one example of why English is a terrible example to compare anything to. One can go to great lengths to misspell Modern English and it will still represent precisely the same language just as they can modernise Middle English spelling to make it more comprehensible (but never fully). In any case, Duenos Inscription is indeed what people call (Very) Old Latin or Archaic Latin - a pre-literary language. Andronicus, Ennius, Plautus, Cato the Elder is what one calls Early Latin, which might have posed a few difficulties to Augustan-time readers because of its use of forms many of which were already archaic or dialectal for those writers themselves, again, just like Victorian English is to us.
  • As I think I've amply demonstrated, your stance is incompatible with the stance of modern Romanistics, which refers to the parent language as Proto-Romance. This language as reconstructed is markedly different from the attested written variety(ies) known as Latin, and there's no agreement yet on how to bridge that gap. This situation is currently unique to my knowledge and does not apply to those other languages you refer to. As some of the articles I linked point out, it is precisely how that gap is bridged that will have a direct impact on how the comparative method is treated in other disciplines - Romanistics is its primary testing ground at the moment.
  • I would say the recognition that the term should be dumped is evident by this point. I for one can't think of a more arbitrary, loaded and corrupt term in the whole of linguistics. Its usage is being propagated because of practical considerations, but these practical considerations are not ours. We are at liberty to abandon the term, and that is what I propose we do.
  • Please explain what you think is the convenience of the corrupt and undefinable term "Vulgar Latin" after we start referring to reconstructed forms as Proto-Romance with diatopic (geographic) variation, and to attested terms as Latin with appropriate labels to mark diachronic (temporal), diastratic (social and register) and diatopic (geographic) variation.
  • Please describe what follows if I agree or disagree with your Balkan Romance proposition.
  • There's no need to join any servers - you can use Discord for private messages only if you prefer.
Brutal Russian (talk) 10:25, 4 June 2020 (UTC)Reply
I would be very much in favor of replacing 'Vulgar Latin' with Proto-Romance and using (broadly speaking) the DERom for phonemic and narrow transcriptions. The words would be mostly limited to reconstructed forms, then, as befits an abstract proto-language.--Excelsius (talk) 08:06, 1 October 2020 (UTC)Reply
I have just excised all code related to the Vulgar Latin output, but this is just a temporary measure. Upon inspection, there were simply too many problems with it, e.g. when fed *vīsāticum it produced [biˈsa.de.gõ] (shown on the page!), with a very unwarranted [b] (cf. French visage), unwarranted voicing of [t k], mysterious nasalization, and even the [o] is questionable... By all means I welcome the use of DERom's Proto-Romance model instead. @Brutal Russian's proposal of Plautine pronunciations is also good in my books, especially now that "Old Latin" entries are proscribed. It would (will) be added as a named argument, maybe one that takes the IPA shown in some way or other. (I'd like to see a suggestion/proposal of how to do this: separate phonemic vs. phonetic arguments?) I have no real opinion about proliferating post-Proto-Romance dialects as @Ain92 suggested.--Ser be etre shi (talk) 09:28, 30 October 2020 (UTC)Reply

"Ecclesiastical" Latin pronunciations should be provided by default[edit]

Someone else (User:KIeio) said this on Template talk:la-IPA page in 2017 but nobody responded and that talk page seems to be dead, so I'll paste my response to him here too below. I wrote out enough of my reasoning once already, but in short, I don't see a convincing reason why "Ecclesiastical" Latin IPA pronunciation is not shown by default when the module is perfectly capable of generating such forms, and it would be an easy change. I see many convincing reasons why "Ecclesiastical" Latin IPA pronunciation SHOULD be included, and as a reference source for "Latin" (NOT "Classical Latin"), Wiktionary is downright defective and incomplete only showing academically reconstructed 2nd century BC "Classical" pronunciation and not showing any evolution of the language's pronunciation—not showing the ONLY way spoken and sung Latin is actually spoken and sung in today's world, in more Romano ("in the Roman manner, custom, habit") Italianate pronunciation that naturally developed in Rome, the cradle of the Latin language, alongside vernacular Italian languages. No choir or schola cantorum sings Latin in "Classical" pronunciation.
We include 5th Century BC Classical Attic, 1st century AD Koine Egyptian, 4th century AD Koine, 10th century AD Byzantine, and 15th century AD Constantinople IPA for "Ancient Greek" (5 different pronunciation evolutions for ANCIENT Greek as late as AD 1452) but not 15th century Rome for Latin, in fact no Rome of any period beyond ONE vague "Classical" 2nd-1st century BC reconstruction; by default the template currently shows only the ONE modern reconstruction of what academics think high-register "Classical" Latin sounded like c. 2nd-1st centuries BC, for a language with usage history spanning over 2,500 years. We should also consider showing Vulgar Latin—also already in module! And maybe in the future potentially add Archaic pre-Classical Latin.
Latin did not die with the Classical period. The "Ecclesiastical" Late Latin of St. Jerome's Vulgate c. 4th-5th centuries AD is not "Classical" but is undeniably Latin, and Medieval Vulgar Latin was still a distinct language till at least 8th-9th centuries before the Proto-Romance languages began to develop and diverge—whither Late Latin IPA pronunciation would provide a useful bridge.
But right now easily changing to automatically include Ecclesiastical IPA underneath Classical IPA seems an incredibly obvious and easy improvement to make; there's no good argument against it, not even a close call....

I see this was never answered, and la-IPA hasn't been updated since 2017, seems abandoned, but I concur with this comment, Latin IPA should be changed ASAP to include the Ecclesiastical pronunciation by default, unless someone can provide a convincing reason to NOT provide both pronunciations. Possibly also include Vulgar Latin pronunciation, although maybe should get more specific in description for that first, as is the case with the Ancient Greek. (Specify for example "5th century CE Late Latin" or "9th century CE Medieval Latin"). The shifts in vernacular pronunciation from "Classical" toward medieval Vulgar, Medieval, Ecclesiastical (and then later Proto-/Old Romance languages) had begun as early as the 1st century AD/CE when the Roman Empire was at its peak, just as Classical Greek of 400 BC Athens had become Koine by 100 BC Alexandria. To only provide the (speculatively RECONSTRUCTED) high-register literary Latin pronunciation of Cicero of c. 3rd-1st centuries BC is misleading and clearly flawed. Especially when compared with other old ancient-medieval-modern evolving languages, the lack of any pronunciation besides the "Classical" becomes a glaring absence.
As KIeio mentions, the Template:grc-IPA automatically provides not one, not two, but five different pronunciations across nearly 2,000 years each specifically described (5th Century BCE Classical Attic; 1st century CE Koine Egyptian (Alexandrian), 4th century CE Koine, 10th century CE Byzantine Medieval, 15th century CE Constantinopolitan Medieval), 3 shown by default (Classical, Koine, Medieval), just click drop-down to see all 5. Often the later Medieval pronunciations resemble and provide a useful bridge to Modern Greek pronunciation. Even Phoenician Punic, an ancient language the pronunciation of which we know far less about, Template:xpu-ipa-rows automatically provides 3 evolving pronunciations (6th century BCE Punic, 2nd century BCE Late Punic, 2nd century CE Neo-Punic), that likewise serves as a useful bridge to Hebrew and other Semitic languages. Hebrew words can have many IPA pronunciations (e.g. Biblical, Tiberian, Ashkenazi, Sephardi, Yemenite, Modern Israeli)...for 2,500+ year-old language Latin whence eventually evolved Italian and other Romance tongues, we have only ONE IPA pronunciation from BC era, as if Latin ceased to exist as spoken tongue 2,019 years ago.
The fact is, the "Classical" pronunciation is reconstructed by best guess of how upper-class Roman Republic citizens like Cicero spoke in Rome during a narrow specific period about 2,200-2,000 years ago. Nobody has ever heard anyone speak Latin like the "Classical" pronunciation; it was not passed down in that form but is rather a modern academic reconstruction. Insofar as Latin has remained a living spoken language, it resembled what we call "Ecclesiastical" pronunciation, closer to the Romance languages which evolved from Latin, obviously most closely the Italic Romance languages of Italy whence the Latin language originated in the first place. Medieval Latin was kept alive all over Europe as a language of scholarship and ceremony, but in particular spoken and chanted Latin naturally passed down as the active liturgical language of the Latin-rite Roman Catholic Church based in Roma, Latium. The Pope of the Roman Church assumed the Classical-era Latin title of Pontifex Maximus; the Latin of the Roman Church has been the standard form of living Latin since the fall of the Roman Empire. Until 1969 every Roman Catholic Mass on earth was said, chanted, and sung in Latin (in "Ecclesiastical" pronunciation), and today many Roman Catholics still attend pre-1969 "Tridentine" Latin Mass rather than vernacular service. The Vatican, the only state where Latin is an official language, still produces a Modern Latin lexicon. As a matter of practical concern, anyone who sings in a schola or choir and who looks up any Latin words on Wiktionary to know how to pronounce them (one of few areas where verbal Latin is still regularly used by people of all backgrounds) will find the "Classical" pronunciation useless, or even worse, will be confused and embarrassed by it if they lack foreknowledge about "Classical" vs. "Ecclesiastical" pronunciation differences, or are even unaware that there is a difference at all.
Wiktionary is useless or worse for most people looking up Latin pronunciations to SAY or SING, to actually put to practical use, unless the person knows about the different pronunciations and knows to edit the source page to add "eccl=yes" to show the Ecclesiastical pronunciation that should have been showing by default.
One more thing may be noted: the "Ecclesiastical" pronunciation was not merely the Latin used in Roman Catholic Church. In fact by the 19th century the liturgical Latin in churches of different countries had become corrupted by influence of local vernaculars. This prompted Pope Pius X in 1912 and Pope Pius XI in 1928 to call upon the universal Roman Catholic Church to purify their Latin pronunciation and conform to the pure living Latin of Rome, Latin pronounced more Romano ("in the Roman style"), i.e. the Italianate pronunciation of Latin that developed naturally in Rome alongside the distinct vernacular Italian Romance languages—which we now call "Ecclesiastical" Latin pronunciation.
While not widely spoken as vernacular, the situation of "Ecclesiastical" Latin vs. "Classical" Latin is most akin to the development of Classical Ancient Greek to Byzantine Koine/Medieval Greek and onward to Modern Greek spoken by Greeks in Greece. 15th century CE Constantinopolitan Medieval Greek pronunciation is still here categorized as a pronunciation evolution of "Ancient Greek", a distinct language from Modern Greek, although the pronunciation is far closer to Modern Greek than to Classical Attic Ancient Greek. As Constantinople was the capital of the Greek world until the 15th century, so Rome was and still is the capital of the Latin world. The way Latin pronunciation evolved in Rome to Italianate "Ecclesiastical" pronunciation should be considered a natural evolution of Latin pronunciation (an evolution into Late Latin that had begun as early as the first few centuries AD) just as Classical Attic Ancient Greek pronunciation naturally evolved into 300 AD Koine and eventually into 1452 AD Byzantine Ancient Greek which is pronounced like Modern Greek (as Ecclesiastical Latin is pronounced similar to Modern Italian). But Wiktionary classifies the language of 1452 Constantinople as "Ancient Greek". With "Latin", we don't even have that slightly misleading "Ancient" label in the name of the language to argue against including post-Cicero Ancient Latin IPA.
So again I strongly concur on making an easy and quick change to automatically include Ecclesiastical Latin pronunciation with the la-IPA template to appear under the Classical pronunciation, 2 pronunciations for a language in use for 2,500+ years or so is hardly unusual; having only ONE pronunciation based on reconstructed 2nd century BC literary Latin is downright defective. The Wiktionary language is LATIN, not CLASSICAL LATIN. Unless someone can provide a convincing argument to the contrary, that's a simple change that should be made immediately. Further down the road we should consider more significant reforms of Latin IPA template to mirror that of e.g. Ancient Greek and Punic in showing historical evolution of pronunciation. Inqvisitor (talk) 19:32, 9 November 2019 (UTC)Reply

equus should be /ˈe.kus/ instead of /ˈe.kʷus/[edit]

Was reading Latin SE and stumbled upon this discussion. The boukólos rule still applied in Classical Latin, and there shouldn't be any labiovelars before /u/; Velius Longus even wrote in the 2nd c. AD: "Indeed, to the ears it is fine to write equus with a single U, but the mind demands two". Ain92 (talk) 16:32, 15 May 2020 (UTC)Reply
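
For illustration only, the rule described above (no labiovelar before /u/) could be expressed as a post-processing step on the generated IPA. This is a minimal sketch in Python, not the module's actual Lua code; the function name `delabialize_before_u` and the assumption that the rule runs on the finished transcription string are hypothetical.

```python
import re

# Hypothetical helper, not part of Module:la-pronunc.
# Assumption: the input is an IPA string in which a labiovelar is written
# as k or ɡ followed by the superscript ʷ, with stress and syllable marks
# placed before the consonant (as in ˈe.kʷus).
def delabialize_before_u(ipa: str) -> str:
    """Drop the ʷ of a labiovelar when the next segment is u or ʊ."""
    # The lookahead leaves the vowel itself untouched; a following
    # length mark, if any, is therefore also untouched.
    return re.sub(r"([kɡg])ʷ(?=[uʊ])", r"\1", ipa)
```

Under these assumptions, `delabialize_before_u("ˈe.kʷus")` gives `ˈe.kus`, while a form such as `ˈe.kʷiː` is left as it is because the labiovelar is not followed by /u/.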

Thanks for reminding me about this, I had fixed the entry itself (until some kind soul reverted it) and intended to fix the template but forgot :P I'm not quite sure what to do with the Ecclesiastical one because no Ecclesiastical guide seems to mention this, and all the Italians I've heard consistently pronounce these words with two separate u-vowels parallel to their realisation of /i.i/ as Lilianuccia here - Italian, like Latin, bans /wu/ (which lends some credit to giorgiospizzi's attempt, hehe). So the current transcription seems doubly wrong. [u.u] would be descriptively correct, but I have no idea about the syllabification prescribed in Ecclesiastical. If it's one syllable, then we have a descriptive-prescriptive problem, because that just doesn't happen for Italians. Brutal Russian (talk) 20:48, 15 May 2020 (UTC)Reply

Enable disabling Ecclesiastical[edit]

Now that Ecclesiastical is automatic, it also can't be disabled, unlike Classical. I cannot into coding so can please someone fix this - I've missed the option on several occasions. Brutal Russian (talk) 17:43, 2 June 2020 (UTC)Reply

Realisation of /ll/[edit]

I noticed recently that wiktionary was transcribing Latin /ll/ as [llʲ] and have traced the phenomenon back to this edit by Brutal Russian (talkcontribs). May I ask what source points towards Classical Latin /ll/ being realized in this specific way? I understand that there are some hints of this in the Spanish outcome /ʎ/, and the /ɖɖ/ seen in Sardinian, Sicilian, Neapolitan - and possibly historical Gascon as well - may be related. (Based on this evidence, two of the sources cited on Proto-Romance language posit /ll/ [ɭɭ].) I have taken a look at Allen's comments (Vox Latina, pp. 33-4) and he makes no mention of a palatal realization for /ll/. It seems to me that it'd be safer to simply have the realization as [ll], for Classical Latin at least. --Excelsius (talk) 07:51, 1 October 2020 (UTC)Reply

@Excelsius It's been discussed right here. Here's an absolutely masterly thesis on the issue that I hadn't mentioned: Müller, Daniela (2011). Developments of the lateral in occitan dialects and their romance and crosslinguistic context. It's very long and involved, so you might want to skip to chapter 5. Long story short, the issue remains unresolved. The current transcription is mainly based on the fact that all the grammarians claim the same realisation for the double /l/ as before front vowels, where evidence for palatalisation is good. But now that you've mentioned [ɭ] and I've listened to some examples from Dravidian, I've realised that that seems to be exactly how I pronounce it in Latin before back vowels (e.g. nūllus). It's semi-conscious, as I was attempting to approximate a possible precursor to the Roman retroflexion, but I hadn't realised the result until now - pretty excited about that! In addition, the sporadic /d > l/ interchange in Latin (and Sabellic) makes more sense with a retroflex realisation than with a palatalised one. The fact that, apparently, in Telugu /ɭ/ > /l/ but /ɭɭ/ remains is also interesting to note. To be clear, the palatalised /l/ is as in French, Arabic, Turkish and Russian (the latter two contrasting it with the velar one).
Personally my take on this (taking into account Sen's and Müller's treatments as well as everything else I've read) is in terms of tense/lax or strengthened/weakened reinterpretation of /ll/ vs /l/: both the singleton and the geminate originally had two allophones depending on the backness of the following vowel. /l/ was originally velarised and dentalised before back vowels (as in the pre-consonantal position), palatalised before front ones. When it was later reinterpreted as the weak allophone of /l~ll/, only one of the original allophones was chosen; /ll/ underwent the reverse by strengthening, resulting in either further retroflexion or further palatalisation (cf. Müller's remarks on strengthening/weakening passim). It seems that some varieties chose opposite outcomes for both, while Northern Italian eventually merged them, at least in quality. So the velarised and palatalised allophones of /l/ originally corresponded to retroflex and palatalised /ll/. Now to me, the retroflex [ɭ], although backed, perceptibly resembles palatalisation and not at all velarisation, which could be why the Romans didn't notice this subphonemic difference in the geminate. Most do mention a type of "barbarism" called labdacism, which took several shapes all of which boil down to a misuse of the allophones in both the singleton and the geminate - I take this to be a precursor to the disparate Romance developments (as described above). Thus for example [ɖ] must have been generalised from the back-vowel environment (cf. Müller's remarks on the synchronic variation e.g. in Corsican). Do you think it makes sense to introduce a rule to transcribe /ll/ as retroflex before back vowels? Brutal Russian (talk) 19:27, 1 October 2020 (UTC)Reply
@Brutal Russian My take on all this is that l exilis stood both for palatal and retroflex l (phonemically /li/_V and /ll/ respectively), as these sounds can be somewhat similar; the remaining l-allophones would then be dental and velar. Müller (2011), a very interesting thesis, by the way (thank you for sharing), describes the retroflex geminate scenario as 'quite likely' (p. 183), although it seems that in her view this was a secondary development from an earlier [ɫɫ] stage. Then there is the view of Sen et al. that l exilis was /li/ [lʲ] and /ll/ [llʲ], with the remaining allophones of /l/ being [l] or [ɫ] depending on the following sound. Then there is the 'mixed' scenario, such as you have outlined above, which I find plausible as well. Quot homines, tot sententiae.
It seems difficult to choose which one of these explanations is both likeliest on theoretical grounds and the best fit for various comments by Roman grammarians. (And since said comments are scattered over several centuries, diachronic change confronts us with yet another variable to consider.) It is for these reasons that I think that a simple /ll/ [ll] may be the safest option for the purposes of Wiktionary. Not because I consider that particular realisation to be likely, or even plausible, but because it represents the neutral/'agnostic' position.
If we must tackle the problem of l-allophones somehow, perhaps we can find some sort of 'compromise' model. One solution could be to use some diacritic to indicate the tense-lax or fortis-lenis distinction between l pinguis and l exilis, without further elaboration, and leave more detailed discussions to the talk page for now.--Excelsius (talk) 05:37, 2 October 2020 (UTC)Reply
@Excelsius Again, transcribing both /ll/ and /l/ _V[-back/-front] as [l] is against the oldest testimony (that of Pliny), who distinguishes L /_V[-back/-front] from L /_V[+front] and L[+long], as well as L /_V[+back], which is most straightforwardly understood as [l] [lʲ] [ɫ], and is understood so by most. The same allophone for /_V[+front] and [+long] is also visible in the fact that while other geminate consonants all simplify after a long vowel, /ll/ doesn't, but before a following /i/ in hiatus it first exhibits variation in spelling and then settles on a singleton: mīlle but mīllia~mīlia (foreshadowing the /ll~lj/ merger of Iberian Romance?). The three degrees are also necessitated by the backing of e > [u]/_[+cons] but e > [o] /_V[+back], e.g. vel- > volō, vult. Later grammarians describe what seems to be the extension of the "exīlis" to the word-initial position as well, a distribution not confirmed for any variety with retroflexion of /ll/, which points against interpreting "exīlis" as "retroflex", at least canonically. If these retroflexing varieties passed through a [ɫɫ] stage, this was definitely the labdacism of the grammarians and definitely not standard. I would consider the clear L of Northern Italian as the reflex of this via de-velarisation (a development already confirmed in the pre-consonantal position). Personally I don't find ɫ > ɭ to be plausible because the two sounds are extremely perceptually dissimilar. Intuitively, I see retroflexion as likely developing either as palatalisation+backing (as in my own speech), or after prior affrication, or simply a merger of the contact-assimilated /rl/ > [ɭɭ] with /ll/ due to the abundance of r-final forms (likely also from /sl/ with a retracted apical [s̺]). Unless you take the "medius" allophone to be something other than [l], I don't see a justifiable way to transcribe /ll/ as the same. Brutal Russian (talk) 18:41, 2 October 2020 (UTC)Reply
@Brutal Russian I said that I don't consider [ll] to be plausible. It's merely an ‘agnostic’ transcription that would avoid a need to specify something that we don't have a clear understanding of. In this scenario, every other allophone of /l/ would also be transcribed that way.
For millia/milia, note that— at least by the time of Proto-Romance— Latin /li/_V appears to have geminated. So, assuming that /lli/_V had the same outcome (consider e.g. the derivatives of malleus, or Gallia > Jaille in a French toponym), the ⟨li⟩ in milia could simply reflect the pronunciation [llʲ] or [ʎʎ]. L would be a singleton only in the orthographic sense.
Incidentally, the failure of /ll/ to merge with /li/_V in most of Romance is another objection I have to interpreting the former as [llʲ].
According to Müller (2011, p. 187) only one Roman grammarian claimed that /l/ was exilis in word-initial position. This is in reference to Consentius (floruit saeculo V), who writes: “exilius autem proferenda est, ubicumque ab ea uerbum incipit, ut in lepore lana lupo”. Judging by the rest of his comments, I think he was simply using exilis indifferently to mean ‘not pinguis’. (He only defines l-allophones in these two terms, unlike the ‘triplex’ model some others use.) Perhaps Consentius felt that word-initial l and geminate l were more similar to each other than either was to l pinguis.
In any case, the existence of an l-medius in word-initial position is relevant. Had that sound generalised by Consentius' time, such that it occurred before all vowels rather than just /e/? Then again, Pliny never (explicitly) said that initial l was medius only before /e/ in his day.
If we accept that l-medius was indeed spreading to new environments, that might explain why Capella assigns it to the word-final position as well, in clear contradiction to Pliny. And indeed, this change matches the evidence from Romance. Admittedly, Capella's other claim that /l/ "numquam ulli semivocali vel mutae praeponitur" is rather strange.
Overall the picture is confused at best, and scholars more knowledgeable than you or I have yet to make sense of it all.
Earlier I proposed a tense-lax diacritic solution; I still think that it would be a good compromise. For instance, we would have (for Classical Latin):
/ˈɡal.lus/ - [ˈɡal͈.l͈ʊs]
/ˈlaː.mi.na/ - [ˈl͉aː.mɪ.na]
Let me know what you think.--Excelsius (talk) 21:10, 2 October 2020 (UTC)Reply
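
To make the proposal above concrete, here is a minimal Python sketch of how such a marking could be applied mechanically. It only adds the two diacritics shown in the examples (U+0348 under geminate l, U+0349 under singleton l) and does not attempt the vowel changes. The function name `mark_laterals` is hypothetical, and nothing here reflects the actual code of Module:la-pronunc.

```python
import re

FORTIS = "\u0348"  # combining double vertical line below, as in l͈
LENIS = "\u0349"   # combining left angle below, as in l͉

def mark_laterals(phonemic: str) -> str:
    """Mark geminate l as fortis and every other l as lenis.

    Hypothetical sketch: a geminate is 'll', optionally split by a
    syllable dot; every remaining l counts as a singleton.
    """
    # Step 1: geminates, keeping the syllable boundary if there is one.
    out = re.sub(
        r"l(\.?)l",
        lambda m: "l" + FORTIS + m.group(1) + "l" + FORTIS,
        phonemic,
    )
    # Step 2: any l not already carrying one of the two marks.
    return re.sub(r"l(?![\u0348\u0349])", "l" + LENIS, out)

print(mark_laterals("ˈɡal.lus"))
print(mark_laterals("ˈlaː.mi.na"))
```

The two-step order matters: marking singletons first would also catch the halves of a geminate.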
@Excelsius Despite the IPA having its major limitations and skewed generalisations, an IPA transcription cannot properly be "agnostic" - for that we have the phonemic transcription. [l] is not an archi-transcription standing for all kinds of lateral consonants, although it does stand for an arbitrarily-chosen range of these sounds, the main factor seemingly that they're all allophones in English. But Wiktionary uses diacritics in its transcriptions, so an unmarked [l] stands specifically for an alveolar lateral approximant. Since someone's decided that we know enough about a never-heard language to narrowly transcribe it, we have to make decisions as to the most plausible realisation of the phonemes. Again, a plain alveolar lateral approximant is not the most plausible realisation of /ll/, but a palatalised one is, with a likely retroflex allophone before back vowels (esp. high back).
I'm not aware of any evidence suggesting that /li/ merged with /ll/ at any point throughout Late Latin - such as a graphical confusion between the two. The variation I mentioned is late-Republican, when i > j /_V was in its very inception, and the merger requires it to be completed first. The failure of that merger is no more an obstacle than the /cl, fl/ developing into /cj, fj/ is a reason to disagree with Priscian's account of Pliny's "pinguis" allophone in these clusters, or than the singleton /l/ before front vowels (exīlis) developing identically to the other two kinds of L is a reason to reject the grammarians' tradition. The liquids were already being reshuffled when they were writing, yet this first-hand account of the allophones of L (albeit in a citation), to my mind, is singular in its precision in the entirety of Latin phonology. I don't know of another Latin phoneme whose allophones are pinpointed and distinguished with the same precision and that fit as well with the linguistic data because the description makes so much sense to modern linguists, and that by the same people who couldn't even grasp the concept of a voice contrast. Which is why I don't understand the reason for your comments about us not having a clear understanding. If we don't have a clear understanding of this phoneme's allophones, then we have no understanding whatsoever of the allophones of any other Classical Latin phoneme.
Many more grammarians than just Consentius call the initial L "exīlis". Note that the latter says "exīlius" comparatively in relation to "pinguis", which is why I think your explanation is correct. I also think Capella's account is due to his Āfricānitās. Still, the likewise-African Donatus says "labdacismi fiunt, si aut unum <l> tenuius dicis, ut Lucius, aut geminum pinguius, ut Metellus". I don't know enough about the grammarians to say which one was more likely to parrot a tradition, except that it definitely wasn't Consentius, a self-professed descriptivist and a favourite of many for this reason.
The medius before /e/ is based primarily on linguistic evidence (see Sen or any historical treatment of L), and only confirmed by Pliny's examples. Personally I don't think either that there's a lack of agreement between the scholars, or that I lack any particular information to make sense of the issue myself, even if I don't have anything close to Sen's knowledge of segmental phonology, or Müller's comparativist data and knowledge of articulatory phonetics. Here's two more opinions to show that there's no ambiguity about it among informed scholars, one from Cser 2016:
"There is compelling evidence that [l] was strongly velarised in coda position, and it is highly probable that it was somewhat palatalised in gemination and before [i]. In onset position it was, in all likelihood, velarised before all vowels except [i]."
and another from Weiss 2009 p.82 (as well as in several other places): "In Latin, *l developed two allophones: a non-velar (possibly palatal) allophone called exīlis before i and when geminate...and a velar allophone called l pinguis elsewhere", adding in a note "The one slight surprise in this distribution is the fact that l is pinguis even before e, e.g., Herculēs < Hercolēs".
I've never seen either the lax or the tense diacritic used in IPA transcription, and I don't understand what you mean by that. Obviously some languages are best described in terms of tense and lax phonemes, most typically stops, but Latin is not one of these languages. When I used the phrase "tense/lax or strengthened/weakened reinterpretation of /ll/ vs /l/", what I meant was that Western Romance seems to have developed this system when it simplified the geminates and voiced the singletons, at which point it reinterpreted /l/ and /ll/ in terms of this feature contrast. This system might survive in modern Spanish and especially in Sardinian, in an especially archaic and confused form. But it would take a lot of well-supported reasoning to demonstrate first that such a feature contrast existed on the phonemic level in late Republican/early Imperial Latin that we're transcribing, and then that the lateral belonged to a class of consonants that participated in that contrast. Even if you managed to achieve that, you'd still have to explain to me the use of the diacritics on the level of narrow/phonetic transcription, because right now after a quick Google and seeing no hits apart from one Old Irish phonemic transcription, I don't think it stands for anything definite or will make any sense to the reader. Brutal Russian (talk) 01:32, 3 October 2020 (UTC)Reply
@Brutal Russian Narrow transcriptions don't have to be precise regarding the realisation of each and every phoneme. For instance, not a single one of wiki's narrow transcriptions of ⟨can't⟩ indicates any degree of vowel nasalisation. Similarly, two of the three American recordings of ⟨caught⟩ are transcribed with [t], despite the sound being a glottal stop. The degree of precision one uses depends on what features one wants to focus on. (E.g. the ⟨caught⟩ transcriptions emphasize vowel differences.)
Considering that [l] could have been one of the allophones of Latin /l/, if we accept the identification of (at least some cases of) L-medius as such, there isn't necessarily any anglo-centric bias in using the same symbol to transcribe other allophones as well.
The diacritics under l͈ and l͉ are the official, albeit seldom used, indicators of fortis and lenis (see e.g. here). Ettmayer, at least, finds 'tense' and 'lax' a suitable match for the Roman terms exilis and pinguis. I'll have to see what exactly he makes of L-medius.
To account for the general non-merger of /ll/ and /li/_V we would have to posit that the former, despite starting as [llʲ], ended up as some other sound before /li/_V > *[lj] > etc. began in earnest. Considering that 'yodification' is evident already in Classical Latin verse, at the very least as an incipient phenomenon, the timing is difficult. Not impossible, mind you- but that is unquestionably a point in favor of [ɭɭ].
When I said that 'we lack a clear understanding,' I was referring specifically to the realisation of /ll/, not singleton /l/.
What other Roman grammarian claims that word-initial /l/ was exilis, or at least exilior than another allophone?
Wiktionary currently transcribes /l/ before /e(ː)/ as a clear [l]; that is in contradiction to Sen, Cser, Sihler, and Weiss - all of whom maintain that /l/ was dark before any vowel but /i(ː)/. How was that schema decided?--Excelsius (talk) 20:29, 3 October 2020 (UTC)Reply
@Excelsius Wrong transcriptions must be corrected, not used as justification for more wrong transcriptions. What we're focusing on is precisely distinguishing the allophones of L. If the medius /l/ is [l], then transcribing /ll/ as [ll] contradicts Pliny's singular (and singularly Classical-age) description as well as data from pre-literary Latin. Your next suggestion makes as much sense to me as transcribing all instances of GA English /t/ and /d/ as [ɾ] because it's one of their allophones. One can only argue for an indifferent narrow/phonetic transcription from acoustic/articulatory similarity. Indifference for allophony is the specialty of broad/phonemic transcription.
I've asked you to "explain to me the use of the diacritics on the level of narrow/phonetic transcription". It's now clear to me that you're failing to grasp the distinction between phonetic and phonemic transcription in principle - I urge you to read up on the difference, we cannot have a coherent discussion until you do. The diacritics meaningfully stand for fortis and lenis in phonemic transcriptions - which I had mentioned and instantly dismissed as not pertinent to Latin. Again, they don't seem to be meaningfully used for /l/. All of the preceding is irrelevant to my question. Let me approach it differently and assert that a statement like "[l] is phonetically fortis or lenis" (= narrowly transcribing it as such) is a linguistically vacuous statement and describes no phonetic reality. There's no feature set, articulation specification or fundamental frequencies it corresponds to. You cannot pronounce these transcriptions or point me to an audio recording. You're a priori transcribing nothing, and this is symptomatic of your general push in this exchange.
I'm entertaining the possibility that /ll/ was universally [ɭɭ], and even that /li/ was [ɭi], but the evidence so far is scarce. Your doubts about the timing are based on the assumption that a contrast between [llʲ] and [lj] is difficult to maintain, for which I see no good justification. The merger needs [lj], but [lj] doesn't need the merger. A little imagination suffices to conceive of [lj] as the starting point of a push-chain of shifts in the liquids, the various labdacisms of de-palatalisation, velarisation, and stop-retroflexion. All the same, Catalan and Italian are happily maintaining a /ʎ(ː)/ — /lj/ contrast (same lack of palatalisation in Catalan as in Czech, which also used to have the palatal /l/ of Slovak). At this point it makes sense to exchange more recordings to understand what each of us is even talking about - I bet we wouldn't even agree on or indeed distinguish the palatalised and retroflex /l/ - but I don't think this is the place to do it. I was hoping the mention of French, Arabic, Turkish and Russian would take care of that.
I can't seem to find other mentions of word-initial exīlis besides Capella and Consentius. Perhaps the fullness of their descriptions gave me the impression I'd read more of the same. I definitely did read several modern treatments arguing for the spread of exīlis to word-initial position. They're mentioned in the doc I link below.
l > [l] /_e(ː) is based mainly on the first thing I linked in this discussion, Sen 2015, as the most detailed study, and is far from contradicting it. No wonder we're talking past each other - please take your time to at the very least scroll through the tables and note how many allophones he distinguishes and in what environments. It's available on LibGen if GoogleBooks is being uncooperative. Even without that, I've stressed several times already that Pliny distinguishes 3 degrees and that linguistic evidence speaks for at least the same number; it worries me that you're still not understanding what I'm saying. Granted, the example I gave is actually not pertinent to the /e(:)/ environment, but to this from Sihler 1995 p.174: "On the evidence of the vowel changes, l pinguis actually had two degrees of avoirdupois, being fatter before a consonant than before a vowel, such that *welō → volō but *weltes → voltis → vultis." To put it in other words, Sihler too distinguishes 3 degrees on the palatalised-velarised axis. I decided that distinguishing two different velarised allophones would be difficult and excessive, though it now strikes me that it could be done as [aˠɫ]. Another difference of the current transcription from Sen is that although he writes [llʲ], he explicitly suggests that palatalisation was "anchored at end of first half of geminate".
Here's yet another account (p. 70, Zago A. (2017), Labdacism: a vitium 'from the provinces'?) of the same facts as the others reaching largely the same conclusions as the others, including me (complete with questioning Adams' treatment of the topic, whose interpretations I consider to be quite off the mark). And a bonus summary table and primary sources summary. Everyone well-informed and living in this millennium seems to be in happy agreement.
As a final word I ask that you consider whether you're actually interested in improving the transcription. When you're suggesting transcriptions devoid of meaning, arguing that we shouldn't aim for precision because some transcriptions are just wrong, to me it seems like simply muddying the waters while milking me for knowledge/conviction points. One could theoretically take any Latin phoneme and do the same thing you're doing. I know full well that transcribing a dead language is filled with uncertainties. I want to see these uncertainties removed through positive evidence. The current transcription is according to the best of my knowledge, and I want that to improve, not regress. If you can provide any more support for retroflexion and, for instance, examples (preferably in studies) of languages where it occurs in palatal environment or i-colours preceding vowels, I'll be very happy. If you can find a study as rigorous as Sen's that comes to different conclusions, that'd be amazing too. Unfortunately, our current discussion doesn't make me very happy, it only reminds me of the time when I didn't know much about Latin phonology (or phonology in general) and everything was uncertain to me. Not to mention that I'm spending way too much effort writing these replies while not getting much in return (oh well, at least I'll have my own notes). In fact, let me ask you directly: had you read anything on the topic besides Vox Latina before starting this exchange? Brutal Russian (talk) 12:28, 4 October 2020 (UTC)Reply
@Brutal Russian I have shown to you that narrow transcriptions do not have to transcribe every single articulatory feature. In fact, if they did, they would be so swamped with different diacritics that they'd become rather difficult to read. Instead of admitting this basic fact, you have decided to attempt to lecture me on the differences between broad and narrow transcriptions. Perhaps it's time that you yourself read up on the definition of the latter.
I'm baffled by your objection to using fortis and lenis in narrow transcriptions when the wiki page that I just showed you does exactly the same thing. If you'd bothered to perform a google search on the matter, you'd have easily found more examples.
With these diacritics we do lose a degree of precision in transcribing singleton /l/; we also- on the other hand- avoid over-precision in transcribing /ll/, which I consider a far greater sin. If you feel otherwise, that is simply a difference of opinion— not fact.
Regarding [ɭɭ]: there is substantial evidence to support this realisation, certainly much more than there is for a supposed /ll/ [llʲ]. Data comes from the entire southern half of Italy, as well as Sardinia, as well as the Cantabrian-Pyrenean-Aquitanian bloc. Müller herself, as mentioned, considers it "likely", and it is not- unless I have missed something- incompatible in any way with the evidence Sen adduces for /l(l)/-allophones.
Speaking of Sen: yes, since you mentioned him I have read through the entirety of his work on the subject. He specifically says /l/ was dark before /e/, as do all the others I mentioned. I cannot believe that I apparently have to prove this, given that he says so passim, but here is one example. For someone who is so against any imprecision in transcribing allophones of /l/, it's curious that you appear to have no problem with transcribing dark l as [l].
The fact that e.g. Modern Catalan has a /ʎ/ alongside /lj/ is not strictly relevant; the latter of the two developed long after the former. The timing presents no difficulties at all. If you imagine, in Latin, that /li/_V becoming [lj], etc. (a process which, let's not forget, had already begun by Classical times) led to the de-palatalisation of your supposed [llʲ], what sound do you imagine resulted from the latter? Simple [ll], perhaps? Surely not; you have condemned it in no uncertain terms. [ɭɭ]? That'd be a humorous way of partially admitting that I'm right. In any case you would have to defend the supposed earlier [llʲ] without any Romance data whatsoever.
The reason that you "can't find other mentions of word-initial exīlis besides Capella and Consentius" is that they do not exist. By the way, Capella does not call L exilis in word-initial position; go re-read his quote. As I said, and as you have brazenly denied, Consentius is the only one to have done so.
I wrote the wiki pages on Proto-Romance and the Reichenau Glosses and have been a latinist for over fifteen years. What qualifications, may I ask, entitle you to such a cartoonish level of arrogance?--Excelsius (talk) 19:07, 4 October 2020 (UTC)Reply
The wiki page shows examples of the diacritic, it doesn't define the phonetic value of l͈ / l͉. Your links contain no phonetic or phonological research. The second link is talking about vowels (funny). Two only discuss the phonemic status of stops - the first link talks about phonemic and "traditional terms" and further narrowly transcribes lenis as voiceless stops. Only the third link attempts to attribute an articulatory basis to the terms and even has you pronounce a fortis and lenis [l]. At the same time it calls them "general terms which describe the impression of the listener" and says they are "rarely used by linguists today", which is my impression as well. You simply don't understand what you're even looking at and are throwing links at me under the assumption that I won't either.
Here's what a phonetic research paper on the actual realisation of the fortis-lenis phonemic contrast looks like (this sort of research I don't understand that well myself, but at least I'm at a stage where I can recognise this). The conclusion is relevant: "Thus there is no evidence from these preliminary data to support the notion of a single independent phonetic correlate of the fortis/lenis distinction in these languages."
*sigh* What Sen describes as "dark" he transcribes as [l°], the medius allophone which occurs before /e(:)/, has no secondary articulation and is 2 on the scale of 1-4 where 1 "clear" is the most palatalised and 4 "darkest" the most velarised. This sound is IPA [l], and that is how we transcribe it. We transcribe allophone 3 "darker" as [ɫ] - this was chosen because most (all?) other treatments, and all grammarians, group it together with 4 "darkest". You seem to be confused by the term "dark", since what is usually called "the dark L" is [ɫ], which in Sen is 2 degrees darker.
Now that I know who you are, I'm unfortunately not surprised. I've talked to you under three different nicknames now, and every time has been the same. I don't think I can change this result no matter how hard I try. We will have to stop this conversation. Brutal Russian (talk) 09:33, 6 October 2020 (UTC)Reply
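
For readers trying to follow the scheme described in this reply, here is a minimal Python sketch of the four-degree scale and the symbols it is mapped to. The degree-to-symbol mapping for degrees 2 to 4 follows the reply; [lʲ] for degree 1 is taken from the [llʲ] transcription discussed earlier in the thread. The choice of environments for degrees 3 and 4 (before a non-front vowel vs. before a consonant or word-finally) is an assumption based on the Sihler quotation, and all names are hypothetical.

```python
# Degree on the 1-4 scale -> symbol used in the narrow transcription.
# Degrees 3 and 4 are deliberately collapsed into one symbol.
SEN_DEGREE_TO_IPA = {1: "lʲ", 2: "l", 3: "ɫ", 4: "ɫ"}

# Assumption: /a o u/ (short or long) all count as non-front here.
NON_FRONT_VOWELS = {"a", "aː", "o", "oː", "u", "uː"}

def darkness_degree(following: str, geminate: bool = False) -> int:
    """Return the degree (1 = clearest, 4 = darkest) for one lateral.

    `following` is the next segment, or "" at the end of a word.
    """
    if geminate or following in ("i", "iː"):
        return 1
    if following in ("e", "eː"):
        return 2
    if following in NON_FRONT_VOWELS:
        return 3
    return 4  # assumption: before a consonant or word-finally

def lateral_symbol(following: str, geminate: bool = False) -> str:
    return SEN_DEGREE_TO_IPA[darkness_degree(following, geminate)]

print(lateral_symbol("e"))                 # medius, plain l
print(lateral_symbol("u"))                 # darker
print(lateral_symbol("u", geminate=True))  # palatalised in gemination
```

Keeping the degree and the symbol as separate steps makes it easy to see where the transcription is less fine-grained than the scale it is based on.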
@Brutal Russian Asking for a specific articulatory value for fortis and lenis demonstrates that you simply don't understand what the diacritics are for.
What difference, exactly, does it make whether the diacritics are used for a vowel or consonant? The point is that they are- in fact- used in narrow transcriptions. Your pointing out that they are not in widespread use is redundant, as I had already told you that a few messages prior. (Remember the words "albeit seldom used"?)
Explaining the obvious - that /l/ can have different degrees of 'darkness' - doesn't change the fact that your transcription fails to denote *any* dark quality at all in /l/ before /e/, which is wrong according to your own sources. You are left with the following two choices:
1) Correct your transcription with diacritics. Trivial enough, surely.
2) Admit that some degree of imprecision is acceptable in a narrow transcription - particularly for a dead language- and recant your vehement rants against the very notion.--Excelsius (talk) 17:44, 8 October 2020 (UTC)Reply

Intervocalic s in Ecclesiastical Latin—not as simple as "always [z]"[edit]

I wanted to bring this up again as I don't feel like there was sufficient discussion above when [z] was added to our "Ecclesiastical Latin" transcriptions. Contrary to Andrew Sheedy's experience of seeing Ecclesiastical Latin intervocalic s consistently described as [z], I've seen at least four sources that say that it is not pronounced [z] (or at least, not like English /z/, which is probably how the majority of our readers will assume [z] should be pronounced), and it seems to be easy to find more where these come from:

I don't think it's necessary for us to try to indicate the strange exception Crow mentions "for borrowed words". Honestly, because the difference between [s] and [z] in Ecclesiastical Latin is not contrastive and is described differently (and rarely comprehensively) by different sources, my own preference would be just to leave it out. (I'm skeptical that it's correct to show [z] in compounds such as acrisepalus and lociservator.) However, if we don't just use [s] everywhere, I would prefer to transcribe the difference as [s] vs. [s̬], with a voicing diacritic rather than the use of separate symbols for the voiced and voiceless variants. Aside from the voicing diacritic allowing for the possibility of partial voicing, as some guides imprecisely imply (with wording like "softened somewhat"), [s̬] might sort of work as a compromise transcription that comes closer to acknowledging the always-voiceless-[s] system that the above sources indicate exists. I'm also inclined to think that we should transcribe the voiced variant before a voiced consonant (when there is no preceding voiceless consonant) as well as between two vowels. None of the sources I've seen say whether a voiced or voiceless variant is used here, but since Italians use voiced [z] in Italian words like smeraldo and organismo, I think [z] would also be used in the Italian pronunciation of Latin words like smaragdus and organismus.--Urszag (talk) 05:57, 31 October 2020 (UTC)Reply
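
The proposal above can be sketched as a small Python function: /s/ gets the voicing diacritic (U+032C) between two vowels, and before a voiced consonant unless a voiceless consonant precedes. The segment inventories and the name `voice_s` are assumptions for illustration only; this is not the module's actual code, and it does not handle the compound cases mentioned above.

```python
VOWELS = set("aeiouɛɔ")
VOICED_CONSONANTS = set("bdɡgvzmnlrjw")
VOICELESS_CONSONANTS = set("ptkfs")
IGNORED = set(".ˈˌː")      # syllable dots, stress marks, length mark
VOICED_MARK = "\u032C"     # combining caron below, as in s̬

def _neighbour(text: str, i: int, step: int) -> str:
    """Nearest segment to position i in the given direction, or ""."""
    j = i + step
    while 0 <= j < len(text) and text[j] in IGNORED:
        j += step
    return text[j] if 0 <= j < len(text) else ""

def voice_s(phonemic: str) -> str:
    """Add the voicing diacritic to /s/ in the two proposed contexts."""
    out = []
    for i, ch in enumerate(phonemic):
        out.append(ch)
        if ch != "s":
            continue
        prev = _neighbour(phonemic, i, -1)
        nxt = _neighbour(phonemic, i, +1)
        intervocalic = prev in VOWELS and nxt in VOWELS
        before_voiced = (nxt in VOICED_CONSONANTS
                         and prev not in VOICELESS_CONSONANTS)
        if intervocalic or before_voiced:
            out.append(VOICED_MARK)
    return "".join(out)

print(voice_s("ˈro.sa"))
print(voice_s("smaˈraɡ.dus"))
```

Geminate /ss/ and word-final /s/ are left unmarked, since neither context applies to them.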

I'm against appealing to conflicting evidence and editors' uncertainty in order to justify confusing the reader and noncommittal, ambiguous, non-transparent transcription. There's enough of this already, and Wiktionary aims at clarity, correctness and scientific rigour. It's not like there can be a lack of authoritative Italian or indeed Latin sources on this, so I don't understand using obscure unreferenced English websites. If I had to guess, these claims come from the emphasis on not voicing the final -s since this is what English speakers default to. That or it's been lifted wholesale from Reconstructed. Modern Standard Italian seems to have moved towards completely eliminating exceptionally unvoiced intervocalic /s/ such as in cosa, rósa during the last century, but as far as I'm aware these only existed in native words, whereas all Latin borrowings had always been consistently voiced.
After checking a couple of articles, it seems there is actually a distinction between modern-Italianising pronunciation, and properly Roman. Rome is in the not-s-voicing territory, and this seems to have extended to its Latin. How about finally giving in and gloriously multiplying our Latin (mis-)pronunciations for each major place and historical period? xDD Short of that, I'd stick with Roman - but I would like some Italian sources for the Roman voicelessness before moving on. As for voicing assimilation of /s/, it exists in nearly all Romance varieties and should definitely be included - btw also for Classical, except before sonorants unlike in Italian. Brutal Russian (talk) 09:39, 31 October 2020 (UTC)Reply
Thank you for the response! I agree that ambiguous transcriptions are undesirable. If you are familiar with authoritative sources, I'd appreciate a pointer! I've had trouble finding complete and authoritative sources about "ecclesiastical Latin" pronunciation. I'll look for an Italian source about voiceless intervocalic [s] in Ecclesiastical/Roman Latin.
Regressive voicing assimilation between consonants is something I've just been thinking about recently. The pattern that you suggested for /s/ (regressive voicing assimilation before a voiced obstruent, but not before a sonorant) is certainly phonologically natural, but I'm not sure there is much evidence for that specific pattern of voicing being used in Latin. If I understand correctly, this system of voicing is used in Polish and a number of other Slavic languages (where /s/ and /z/ are separate phonemes that become neutralized word-finally or before an obstruent). However, I cannot find any Romance language where that pattern applies to voiced and voiceless allophones of /s/: allophonic voicing to [z] seems to apply before sonorants just as much as before voiced obstruents in the accents of Spanish and Italian (and various related languages of Spain and Italy) that show regressive voicing assimilation of /s/. (French lost /s/ in this context, which doesn't really tell us much.)
Since original sequences of /s/ + sonorant or voiced obstruent in Latin were eliminated by the loss of /s/, words with these sequences mainly consist of transparently complex formations or borrowings from Greek. Modern Greek has [z] in words spelled with σμ, and on Wikipedia, we show voicing assimilation here as going back to "5th BCE Attic" (e.g. our entries for πρίσμα (prísma), φάσμα (phásma), φάντασμα (phántasma), βαπτισμός (baptismós), σμάραγδος (smáragdos) alongside Θίσβη (Thísbē)). So unless there are sources that say that such words are pronounced with specifically voiceless [sm] in some variety of Latin—unlike what we see in Greek and in Romance—it seems simpler to me to assume that voicing applies equally to sm-words like prisma and sb-words like Thisbe.--Urszag (talk) 21:30, 31 October 2020 (UTC)Reply
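The two candidate patterns discussed in this thread (voicing of /s/ before voiced obstruents only, versus before voiced obstruents and sonorants alike) can be compared with a short Python sketch. Everything here is an assumption made for illustration: the function name, the flag, the tiny consonant sets, and the use of plain spelling letters as segments.

```python
# Hypothetical, simplified trigger sets; plain letters stand in for segments.
VOICED_OBSTRUENTS = set("bdg")
SONORANTS = set("mnlr")


def voice_s(word, sonorants_trigger):
    """Rewrite s as z before a triggering consonant.

    sonorants_trigger=False: only voiced obstruents trigger voicing
    (the Polish-like pattern). sonorants_trigger=True: sonorants trigger
    it too (the pattern described above for Spanish, Italian and Greek).
    """
    triggers = (VOICED_OBSTRUENTS | SONORANTS) if sonorants_trigger else VOICED_OBSTRUENTS
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == "s" and nxt in triggers:
            ch = "z"
        out.append(ch)
    return "".join(out)


for w in ["prisma", "thisbe"]:
    print(w, voice_s(w, sonorants_trigger=False), voice_s(w, sonorants_trigger=True))
```

Under these assumptions the two patterns agree on sb-words like Thisbe and differ only on sm-words like prisma, which is the point at issue.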

Tapped R; Retracted S

Discussion/comments on these two changes. The tapped R is self-explanatory, yet I've encountered a surprising number of people, mainly native English speakers, who were entirely oblivious to it. The opposition /r/-/rr/ as tap vs trill is the crosslinguistic norm, no reason to doubt it for Latin. Now also tap in complex onsets, but heterosyllabic cases like abripiō are still trilled - is there a shorter/more elegant way to code this btw? References for the retracted S (which I did not specify further as apical or laminal for lack of evidence) are these:

  • Adams D.Q. (1975). The Distribution of Retracted Sibilants in Medieval Europe - scihubbable
  • Vijūnas A. (2010). The Proto-Indo-European Sibilant S
  • Widdison K.A. (1987). 16th Century Spanish Sibilant Reordering-Reasons for Divergence
    • I have no problem with the transcription of /s/ as retracted, but I'm less in agreement with transcribing a two-way contrast between a trill allophone [r] and a tap allophone [ɾ] for the phoneme /r/. It seems to be debated in some languages whether the usual realization of singleton intervocalic /r/ is better described as a tap or as a single- or double-contact trill (see the note in w:Italian phonology#Consonants). Transcriptions of Italian terms on Wiktionary do not seem to generally make use of [ɾ]; e.g. abbronzare, acciughero. It goes without saying that Latin /r/ would have corresponded to a range of phones, some more common than others in certain contexts, and it's very likely that not all of these phones were trills, but I don't think there's sufficient evidence to reconstruct a conditioned allophonic dichotomy between a trill and a tap allophone, rather than allophony that included free variation between these and possibly other realizations such as an approximant or fricative. There also isn't much evidence that I know of about which allophones would be more common in different specific environments. In Standard Spanish, all word-initial "r" is fortis (even after a vowel), but in Standard Italian, I believe the pronunciation of word-initial /r/ in intervocalic position is not obviously different from the pronunciation of singleton word-medial /r/; Moroccan Judeo-Spanish apparently shows a tap in phrases like la reina. Crosslinguistically, there doesn't seem to be consistency about when post-consonantal rhotics are realized (or at least, described) as trills or taps: Łukasz Stolarski (2015) says that the Polish rhotic is usually a tap after either a tautosyllabic or heterosyllabic consonant. A paper I found by Travis G. Bradley (2001) says that Basque rhotics are neutralized to a trill, not to a tap, when not intervocalic (giving [prantses], not *[pɾantses], as one example), although I'm somewhat skeptical about the phonetic accuracy of this description ("An acoustic description of Mixean Basque", by Egurtzegi and Carignan, 2020, says "the most common realization of onset-cluster rhotics involves one tap (60.7% of all items, and 35.7% with no taps)"). --Urszag (talk) 04:48, 22 November 2020 (UTC)Reply
      • @Urszag Sorry for replying late, I'm just not sure how to approach this, so let me explain my reasoning. Whether there's a phonetic or articulatory difference between a single-contact trill and a tap (or a flap for that matter, which to me is clearly different from both) is not my concern in this case. Instead what I'm trying to do is introduce some sort of notation that would inform the dictionary's user about the allophony that most likely existed, and about the relatively more appropriate articulation. It's important to observe that languages such as Spanish and Basque do not distinguish single and geminate consonants at all. What they have instead is two different rhotic phonemes that are consistently contrasted word-internally, and maintain their identity on word boundaries - in Spanish, a word-final tap /ɾ/ followed by a vowel-initial word usually remains a tap, but a word-initial trill /r/ always remains a trill. Basque, on the other hand, has both underlying trills and taps word-finally, visible e.g. after suffixation. Both the trill and the tap syllabify as onsets, which is just one clear indication that no gemination is involved. In absolute final position the contrast is neutralised in both languages, with both single and multiple contact realisations possible in Spanish (p. 120). See here p. 183 for more details and summaries.
      • I'm hoping that the transcription I've implemented will allow speakers of multi-rhotic languages like Spanish and Basque, single-rhotic geminateless ones like Japanese or English, as well as those that possess geminates (Italian, Arabic, Turkic), to opt for the most appropriate articulation. The former group must be told to use the tap in order to successfully imitate the Italian intervocalic R, because their [tera] closely maps to the Italian geminate; the second group is likely to not know about the distinction at all (again, as I've found out repeatedly), and must be clearly informed about it (here's what might happen if you don't xDDD). Since the IPA offers no way to represent a mono/bivibrant trill, [ɾ] is the next best choice. Finally, the speakers of languages possessing geminate-rhotics are already set: they hear the Spanish/Basque tap as their single intervocalic /r/, and the reverse also holds for Spanish speakers. Since no language I know of contrasts taps with mono/bivibrant trills, which one Latin had is simply irrelevant.
      • The only language I'm aware of that consistently realises the singleton intervocalic /r/ as a multivibrant trill (perceptually ~3 vibrations) is Finnish. This sits well with the fact that this language apparently has no branching onsets (3 different sources mention this), and word-initial CR clusters are forbidden (no such native words exist). Compare that to Spanish with its always tapped, tentatively always tautosyllabic /Cɾ/, whereas heterosyllabic postconsonantal /r/ must be a trill. The Latin variable syllabification is a perfect match for this variation - which is impossible in Spanish. Among other things, Latin borrowings into Basque have precisely the same distribution as their reflexes in Spanish, and so, it appears, do Latin borrowings into Albanian.
      • Theoretically speaking, I know of exactly one case to class the tap separately from the trill: Portuguese, which apparently has the Iberian-type two-rhotic system in which only the trill can be uvular/velar or an alveolar approximant, but the tap doesn't seem to allow the same allophony. Neither have I ever heard a Spanish speaker with a uvular instead of a tap. Meanwhile in languages like Italian and Russian, uvular trill realisations are common and extend to the single rhotic in all positions. This however belongs to phonology; phonetically the single intervocalic /r/ of Italian and Russian are monovibrants and for practical purposes the same as the Iberian tap, and you only need to browse forvo to confirm that. Brutal Russian (talk) 12:20, 27 November 2020 (UTC)Reply
        • Here's a stimulating article I'd totally forgotten about, also quite pertinent to the question of /L/ (the book is libgennable). In it the author only talks about a flap and doesn't even seem to mention the trill. I would add that the Oscan pervasive anaptyxis both in onset (patereí "patrī") and coda (aragetud "argentum") points also to a tap, even in coda. See also what's termed svarabhakti in Spanish (and elsewhere, accompanying the tap). Brutal Russian (talk) 13:15, 27 November 2020 (UTC)Reply
          • @Brutal Russian Would you find it acceptable to unconditionally transcribe /r/ as [ɾ] and /rr/ as [rː]? I would prefer that over trying without any direct evidence to indicate when singleton /r/ may have been more likely to be realized as [r] vs. [ɾ]. If we're transcribing singleton intervocalic /r/ as [ɾ], and treating that symbol as potentially encompassing monovibrant and bivibrant trill realizations, I don't think there's any strong reason not to also transcribe singleton /r/ as [ɾ] when it occurs word-initially or after a coda consonant. If the reason for avoiding [r] in transcriptions like virīlis [wɪˈɾiː.lʲɪs̠] is for fear that Spanish speakers might mistakenly pronounce it in a way that sounds like [wɪrˈriː.lʲɪs̠] /wirriːlis/, I think it would make equal sense to avoid word-initial [r] in the transcription of words like rēs in case Spanish speakers mistakenly pronounce phrases like ūna rēs in a way that sounds like [ˈuː.narˈreːs̠] /uːnarreːs/. There seem to be a few academic sources that describe a tap as a likely pronunciation of Latin /r/ without any special reference to word-internal intervocalic position: aside from the article you linked to, there is "Although Classical Latin had only one rhotic, presumably the tap [ɾ], Spanish strengthened this tap to the trill /r/ in word-initial (e.g. rojo ‘red’) and postconsonantal syllable-initial positions (e.g. alrededor ‘around’, Enrique ‘Henry’)" ("Syncope in Spanish and Portuguese: The Diachrony of Hispano-Romance Phonotactics", Eric Adler Lief, August 2006, 2.3.1, page 61).--Urszag (talk) 05:36, 28 November 2020 (UTC)Reply
            • @Urszag Actually I do object to this, on the following grounds. Latin has one rhotic whose unconditioned realisation, judging by all the descriptions, was a trill (littera canīna etc). The tap is the conditioned allophone when in the phonologically weak position - being in onset preceded by a vowel - including branching onsets. In other positions the strong, trill realisation should be assumed - see Lai (2013) p.54+ for phonological details. Again, with rhotics, this is the crosslinguistically default strategy to ensure the length contrast. And again, this distribution explains the variable syllabification. What follows is that word-initially it's to be transcribed as a trill. Precisely this is reflected by:
              1. Italian except the Northern dialects, which seemingly preserves the Latin system intact;
              2. The entire Iberian Romance, which has converted it to a two-rhotic system phonemically, but phonetically basically preserves the Latin distribution, minus the sandhi (could be preserved by Mozarabic);
              3. Basque and Albanian reflect Latin with the trill word-initially, tap in intervocalic and branching onset, and word-finally I have no relevant examples; in Albanian there are sporadic curiosities such as this: tmerr. In Basque, the word-initial trill induced a prothetic vowel like in native words and in #4 (errege) - there are no initial rhotics.
              4. In Gascon (as well as some Spanish varieties), South Sardinian (again Lai (2013) p. 80-81 and elsewhere) and proto-Aromanian (the language's very name for example), the word-initial rhotic induced a prothetic vowel. This probably also goes back to an Iberian-type two-rhotic system - Gascon and Spanish from Basque with the same initial rhotic ban; the Sardinian consonant system is fortis-lenis, a mirror of Old Spanish (Paleosardinian = Vasconic confirmed?? :P); in proto-Aromanian the two rhotics might have been borrowed from the likes of proto-Albanian, but then collapsed again. The Sardinian prothesis is attested in the first documents from the 11-12th centuries, although some varieties show rhotic-initial prothetic vowels followed by a tap. In the remaining varieties svarabhakti is usual (Frigeni 2009 p. 143).
              5. All of this clearly points to the same strong/weak allophony as presented in Lai and still visible in Italian. Notice that this is entirely separate from a fortis/lenis system typical of Iberia: the Iberians interpreted the Latin rhotic's allophony in terms of their fortis/lenis just like they conflated the Latin voiced/voiceless and single/geminate 4-way contrast in the same binary terms. These systems seem to be complementary, with a one-directional diachronic tendency.
              6. What about the Oscan examples? These were to demonstrate that the tap must have been a possible and current realisation within a few dozen kilometers from Latium. Oscan had no Latin-type rhotacism, which might very well show a sort of differentiation of allophony in the two languages: while Latin incorporated more realisations (fricatives, approximants) into its rhotic, Oscan seemingly moved towards more taps - but notice no prothesis word-initially! I want to speculate that it was actually an anti-fricative, anti-trill cacuminal flap to pair with what eventually happened to the Latin /ll/ in the same territory. But then there's South Picene, which spells word-initial (?) *kl-, *pl- as QDVÍV́ (= clueō), PDVFEM - but it also likely had the Oscan-style anaptyxis, while still spelling *r as R. So if the latter was a tap, D might have instead been the same fricative as the Umbrian Ř (RS in Latin script) from *d intervocalically and *l before palatal vowels, suggesting it used to be a lateral - Zamponi (2019) for a quick language overview; also Wallace (2007).
            • Using such a transcription to help Spanish speakers would be like transcribing the word-final M as [n] lest, say, Portuguese or French speakers forget to turn it into an [n] when a dental consonant follows (I just heard this two days ago). There is only so much we can do without sacrificing the accuracy of our transcriptions. Most readers will have to refer to some external explanation of Latin sandhi, and I don't believe it's the job of single-word, absolute phonetic transcriptions to teach it to them. In any case, if a Spanish speaker successfully acquires the system of consonant gemination (which to them sounds like "very fortis" natively), I would expect them to unconsciously readjust their rhotic system in line with it. Brutal Russian (talk) 09:55, 28 November 2020 (UTC)Reply
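On the earlier question of how the tap/trill distribution might be coded, here is a rough Python sketch of the two proposals in this thread. It is not the module's actual code. It assumes input that is already syllabified (a list of non-empty syllable strings with precomposed macron vowels), and the function names are hypothetical. The conditioned rule taps /r/ in a syllable onset when it follows a vowel or sits in a branching onset, and trills it elsewhere; the unconditional rule taps every singleton /r/ and leaves geminates alone.

```python
VOWELS = set("aeiouyāēīōūȳ")


def onset_length(syllable):
    """Number of consonant letters before the first vowel of the syllable."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return i
    return len(syllable)


def conditioned_rhotics(syllables):
    """Tap in an onset after a vowel or inside a branching onset; trill elsewhere."""
    out = []
    for i, syl in enumerate(syllables):
        onset_len = onset_length(syl)
        chars = []
        for j, ch in enumerate(syl):
            if ch == "r" and j < onset_len:
                in_cluster = j > 0  # e.g. the r of "bri" or "trī"
                after_vowel = j == 0 and i > 0 and syllables[i - 1][-1] in VOWELS
                if in_cluster or after_vowel:
                    ch = "ɾ"
            chars.append(ch)
        out.append("".join(chars))
    return ".".join(out)


def unconditional_rhotics(syllables):
    """Tap every singleton r; leave both halves of a geminate r.r as a trill."""
    out = []
    for i, syl in enumerate(syllables):
        chars = []
        for j, ch in enumerate(syl):
            if ch == "r":
                first_half = (j == len(syl) - 1 and i + 1 < len(syllables)
                              and syllables[i + 1][0] == "r")
                second_half = j == 0 and i > 0 and syllables[i - 1][-1] == "r"
                if not (first_half or second_half):
                    ch = "ɾ"
            chars.append(ch)
        out.append("".join(chars))
    return ".".join(out)


for word in (["ab", "ri", "pi", "ō"], ["a", "bri", "pi", "ō"], ["rēs"], ["ter", "ra"]):
    print(conditioned_rhotics(word), unconditional_rhotics(word))
```

Because the conditioned rule only looks at the syllable boundary, the heterosyllabic and tautosyllabic parses of abripiō come out differently without any special casing; the two proposals disagree on word-initial cases like rēs and on coda /r/.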