Module talk:grc-translit

From Wiktionary, the free dictionary
Latest comment: 7 years ago by Erutuon in topic Unreadable diacritics

Accents in transliterations

Why are they ignored? --Ivan Štambuk (talk) 21:17, 16 October 2013 (UTC)

Probably because this has been standard practice for some time. As one of the fiercest champions of the "no accents in Ancient Greek transliterations" school, I'll give you my standard spiel. Simply put, transliterations on Wiktionary are meant to give a quick and dirty approximation of the pronunciation of a non-Latin-script word to the uninitiated. We have a number of more precise means at our disposal, such as the pronunciation sections of entries, as well as the actual scripts, which always always always precede the transliteration, but the transliterations are handy for people who don't care enough to learn the script in question or click the link and go to the word's entry. Additionally, all of this translit jargon (goofy diacritics, numbers, non-standard letters) doesn't really mean anything to the people it's supposedly there to help. The only ones who understand any of it are the people entering it in the first place, namely the people who already know what the script in question sounds like. Essentially, highly technical transliterations on Wiktionary (to say nothing of other contexts where the original scripts might not be available) distract and confuse the audience which actually uses them, simply to mollify that which does not. I'm all for circle jerks, but not in public. -Atelaes λάλει ἐμοί 04:38, 27 October 2013 (UTC)
Thanks! --Ivan Štambuk (talk) 04:43, 27 October 2013 (UTC)

Capitalization of digraphs.

There are several Greek letters that we transliterate using digraphs, such as th and ps. When these letters appear capitalized, we capitalize just the first letter of the digraph: Th, Ps. I think this gives wrong results for all-caps text. So, two questions:

  • Do we ever have all-caps text? (If not, then of course this doesn't matter.)
  • Can the behavior be improved algorithmically? That is, can we autodetect whether Th or TH is more appropriate in a given transliteration?

RuakhTALK 18:29, 27 October 2013 (UTC)

I can't think of any situation where it has arisen thus far. We've generally just followed the practice of modern Greek and Ancient Greek dictionaries, which is to prefer lowercase, and capitalize the initial letter for proper nouns, sentence beginnings, etc. That being said, Ancient Greek was majuscule only until the early medieval period, if memory serves correctly, and it might be nice to be able to represent all-caps words without the transliteration going crazy. If Lua lacks a capacity to inherently distinguish between upper and lower case letters, we could simply create a list with all the caps in it, then tell it to do TH if the second Greek letter is uppercase. -Atelaes λάλει ἐμοί 19:20, 27 October 2013 (UTC)
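
The "do TH if the second Greek letter is uppercase" idea can be sketched roughly as follows. This is an illustration in Python rather than the module's Lua, and it is not the module's code: the table `DIGRAPHS` and the function `digraph_at` are hypothetical names, and the table covers only the two digraph letters named in this thread. It assumes the neighbouring character is a base letter, not a combining mark.

```python
# Hypothetical table: only the two digraph letters named in this thread.
DIGRAPHS = {"θ": "th", "ψ": "ps"}


def digraph_at(word: str, i: int) -> str:
    """Transliterate the digraph letter word[i], picking th, Th or TH."""
    letter = word[i]
    latin = DIGRAPHS[letter.lower()]
    if not letter.isupper():
        return latin
    following = word[i + 1 : i + 2]            # "" at the end of the word
    preceding = word[i - 1 : i] if i > 0 else ""
    # An uppercase neighbour suggests all-caps text: check the next letter
    # first, and fall back to the previous one for a word-final letter.
    neighbour = following or preceding
    if neighbour.isupper():
        return latin.upper()
    return latin.capitalize()


print(digraph_at("Θεός", 0))  # Th
print(digraph_at("ΘΕΟΣ", 0))  # TH
```

A capital that stands alone has no neighbour to compare against, so this sketch falls back to Th in that case; whether that is the right default is a judgment call.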

H between vowels

I happened upon the word Μῶἁ (Môha), listed in the Alternative forms section of Μοῦσᾰ (Moûsa). Unfortunately, due to an h-transposition process in the transliteration module, it gets transliterated as Mhôa. Could something be done to fix this? I guess the transposition process is to correctly transliterate stuff like οἷος (hoîos), which would be ohios otherwise. Maybe the transposition could be made to happen only when the preceding letter is a vowel, or something? [edit:] Oops, that doesn't make sense. Only when the preceding vowel is at the beginning of a word? Or even easier: only if the rough breathing is on υ, ι (u, i). — Eru·tuon 23:25, 2 October 2016 (UTC)

Pinging: ObsequiousNewt, CodeCat. I think I fixed the problem. Μοἁ is no longer transliterated as Mhoa, and other transpositions seem to happen as they should. — Eru·tuon 00:14, 4 October 2016 (UTC)

Hmm, if there are more words with this odd Laconian h, there may be cases where the code generates the wrong output – for instance, if a word has υὑ ιἱ (uhu ihi) – but those can be dealt with as (or if) they pop up. — Eru·tuon 00:17, 4 October 2016 (UTC)

@Erutuon: This is by far the strangest phenomenon I've seen. Lenition of intervocalic /s/ is attested in Laconian, Argolic, Elean, and Cyprian. But in inscriptions it's spelled with Η, if at all, and it's never transcribed with the spiritus asper, but rather with the Latin letter h. According to Buck "Examples of σ omitted are also in Ar.Lys. and in glosses", the latter of which probably means Hesychius or similar. LSJ lists three that I can find: Μῶἁ, πάἁ (line 995), σάἁμον (IG V(1) 364). I really want to find the source at least for the last one, as I have no idea why an inscription would actually have the spiritus asper on it. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 00:39, 4 October 2016 (UTC)
@Erutuon, ObsequiousNewt: I believe the recent changes may have caused ᾱ̔δομαι (hādomai) at *sweh₂d- to stop displaying correctly. —JohnC5 15:22, 5 October 2016 (UTC)
@JohnC5, Erutuon: Leave this one to me, please. I'm going to rewrite the function Better. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 19:32, 5 October 2016 (UTC)
@ObsequiousNewt: Thanks! Could we add some testcases to this module? —JohnC5 21:58, 5 October 2016 (UTC)
@ObsequiousNewt: I didn't know there was a decomposition function... Cool! — Eru·tuon 01:05, 6 October 2016 (UTC)
@Erutuon: I didn't know about it either. —JohnC5 02:46, 6 October 2016 (UTC)
@ObsequiousNewt: Should υἱός (huiós) be huiós as mentioned in Wiktionary:Ancient_Greek_transliteration? —JohnC5 17:37, 6 October 2016 (UTC)
@JohnC5: I previewed *sweh₂d- using my old version of the module, and sure enough, it messed things up. I don't understand why... oh, I suppose it was because the rough breathing was a combining diacritic, and thus was placed after the vowel. Oh well. — Eru·tuon 01:11, 6 October 2016 (UTC)
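
The approach discussed in this thread (decompose first, then decide where the h goes) might look something like the sketch below. It is written in Python for illustration and is not the module's Lua code; `transliterate`, `LATIN` and `VOWELS` are hypothetical names, and the letter table is deliberately tiny and ignores vowel length and accents. After NFD decomposition the rough breathing (U+0314) always follows its base vowel as a separate combining character, whether the input was precomposed (ἁ) or typed as a combining mark (ᾱ̔). The rule assumed here is the one suggested above: transpose h to the front only when nothing but vowels precedes the breathing.

```python
import unicodedata

ROUGH = "\u0314"  # combining rough breathing (dasia)
VOWELS = set("αεηιουω")
# Hypothetical, deliberately tiny table; it ignores vowel length and accents.
LATIN = {"α": "a", "δ": "d", "ι": "i", "μ": "m", "ο": "o",
         "σ": "s", "ς": "s", "υ": "u", "ω": "o"}


def transliterate(word: str) -> str:
    out = []            # one Latin string per Greek base letter, plus any h
    only_vowels = True  # True while no consonant has been seen yet
    for ch in unicodedata.normalize("NFD", word):
        if ch == ROUGH:
            if not out:
                continue  # stray breathing with no base letter
            if only_vowels:
                # Word-initial vowel or diphthong: h goes in front of it all.
                if out[0].isupper():
                    out[0] = out[0].lower()
                    out.insert(0, "H")
                else:
                    out.insert(0, "h")
            else:
                # Word-internal: h goes directly before the vowel carrying it.
                out.insert(len(out) - 1, "h")
        elif unicodedata.combining(ch):
            continue      # drop every other diacritic in this sketch
        else:
            base = ch.lower()
            if base not in VOWELS:
                only_vowels = False
            latin = LATIN.get(base, ch)  # unknown letters pass through
            out.append(latin.upper() if ch.isupper() else latin)
    return "".join(out)


print(transliterate("οἷος"))  # hoios
print(transliterate("Μῶἁ"))   # Moha
```

The sketch only handles breathings on vowels, and a word-initial hiatus such as the υὑ case mentioned above would still be transposed wrongly.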

Unreadable diacritics

I was trying to replace literal diacritics with variables to make the code more readable, but for some reason my edit here made ἕπομαι (hépomai) be transliterated as eh́pomai. For some reason, matching '̔' (combining dasia) gave a different result than matching DASIA (= mw.ustring.char(0x314)). When I replaced one with the other, it worked again. Not sure why this would be. — Eru·tuon 21:39, 7 January 2017 (UTC)
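
One way to investigate a mismatch like this is to list the code points of the literal actually sitting in the source and compare them with U+0314. The Python sketch below is only a diagnostic aid under that assumption; it does not explain what happened in the Lua module, and `describe` and `CANDIDATES` are hypothetical names. Several distinct characters can render as a dasia, so a literal that looks right is not guaranteed to be the same code point as the variable.

```python
import unicodedata

DASIA = chr(0x314)  # same code point as mw.ustring.char(0x314) above


def describe(text):
    """Return (code point, Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in text]


# Characters that can look like a rough breathing on screen.
CANDIDATES = {
    "combining dasia": "\u0314",
    "spacing Greek dasia": "\u1ffe",
    "combining Cyrillic dasia": "\u0485",
}

for label, mark in CANDIDATES.items():
    # Only the first candidate compares equal to DASIA.
    print(label, describe(mark), mark == DASIA)

# Order of the marks on ἕ after canonical decomposition (NFD).
print(describe(unicodedata.normalize("NFD", "\u1f15")))
```

Pasting the suspect literal from the module source into `describe` would show whether it really is U+0314 or one of the lookalikes.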