Wiktionary talk:Persian transliteration


"ā" or "â"?

Is this official, i.e. should I be changing all the "ā"s to "â"s? --Ivan Štambuk 19:07, 2 April 2008 (UTC)

I don't think so. I raised the issue of transliteration when I started using this site, but the discussion was kind of abandoned. However, Persian entries still only number around two and a half thousand, so it's not a disaster that it's not yet standardised. I think it's right to say that Dijan uses ā/š/kh/ž/č and I think Stephen G. Brown uses ā/š/x/ž/č. I use â/sh/kh/zh/ch. My own opinion is that sh/kh/zh/ch is the most user-friendly and more common, and that š/kh/ž/č (or š/x/ž/č) is a more academic style. As you know, both systems have advantages and disadvantages. There does need to be a decision between the two. Pistachio 19:24, 2 April 2008 (UTC)
Regarding ā and â, I believe that â is actually marginally more common overall. Pistachio 19:32, 2 April 2008 (UTC)
Yesterday I saw a Persian entry that simultaneously used both 'â' and 'ā' in transliteration, and that is kind of... very wrong, for obvious reasons. The macron is almost universally used in most of the languages of the world for transliterating/transcribing long vowels; the circumflex is kind of a Persian-only thing.
As for 'sh' vs 'š' and similar, I'd personally prefer the latter, and generally a single letter over a corresponding digraph that is meant to somehow "approximate" the phonetic value of the transliterated character in English-familiar terms. There's a ===Pronunciation=== section to accommodate those unfamiliar with Persian phonology; we shouldn't be bastardizing the transliteration scheme instead.
I suggest someone knowledgeable put this to a vote at some time. It would also be great if someone could expand this with examples for stress notation and hyphen-separation for prefixes and enclitics in transliterations. --Ivan Štambuk 16:44, 3 April 2008 (UTC)
It is counter-productive to use 'š', a letter which is not familiar to native English speakers. Pistachio 19:25, 3 April 2008 (UTC)
And I might add that it's a bit bloody rude to come in here and accuse me of 'bastardising', no less, the transliteration scheme after all the hundreds of hours of my own time I have spent trying to add entries in Persian. Pistachio 19:36, 3 April 2008 (UTC)
I'm sorry if it felt rude to you, but I didn't accuse you of anything, nor did I try to invalidate your contributions here. Two important points must be made clear:
  • The macron is generally much more prevalent for indicating vowel length.
  • There are numerous transliteration schemes for other languages here on Wiktionary using obscure characters such as 'ṅ', 'ś' or 'ě', which the average English user (for a definition of "average user", browse WT:FEED) will probably never encounter in his life. Confining oneself to the world of 7-bit ASCII or Latin-1 in a Unicode-aware Wiktionary might be exaggerating.
Here is a PDF comparing various transliteration schemes. I suggest this be put to a vote ASAP, because it looks silly that the three persons contributing to Persian entries have for 2+ years each been using his own mutually incompatible scheme. --Ivan Štambuk 11:19, 4 April 2008 (UTC)

References

Does the current version correspond to an established system? Can someone provide a reference?

Discussing the requirement for references at Wiktionary:Beer parlour#Transliteration appendices. —Michael Z. 20:10, 12 April 2008 (UTC)

Revival of discussion

The discussion regarding transliteration schemes and the usage thereof needs to be revived, transliteration being an important part of a dictionary containing entries in an alien script. There are some points I would like to bring forth, each of them representing my personal views:

  • A 1:1 relation between Persian and Latin letters is preferable, for a number of reasons:
    • Gemination is more clearly represented, i.e. čč instead of chch.
    • Ambiguity problems are resolved, i.e. ž can only mean ژ, as opposed to zh which could mean زه or ژ.
    • It is more consistent and easy to read.
  • Capitalisation of proper names, initial words etc. in transliterated text should always be avoided. Capitalisation is a concept absent in Persian and therefore it should be left out.
  • Homophonic letters should be transliterated differently, i.e. ظ ز ض ذ should all have their own Latin counterparts, which is not the case in many of the transliteration schemes employed on Wiktionary. Transliteration schemes should be as bijective as possible.

Much more can be said on this subject; these are some starting points. ✎ HannesP · talk 21:51, 5 January 2012 (UTC)
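
The ambiguity and gemination points in the list above can be checked mechanically. The following is only a minimal sketch: the two tables are hypothetical and cover just the letters mentioned (ژ, ز, ه, چ), not any full scheme in use on Wiktionary.

```python
# Hypothetical toy tables for illustration; they cover only the letters
# discussed above, not a complete transliteration scheme.
DIGRAPH = {"ژ": "zh", "ز": "z", "ه": "h", "چ": "ch"}
SINGLE = {"ژ": "ž", "ز": "z", "ه": "h", "چ": "č"}


def romanize(text, table):
    """Map each Persian letter to its Latin value; pass unknown characters through."""
    return "".join(table.get(ch, ch) for ch in text)


def is_injective_on(words, table):
    """Return True if no two distinct words share the same romanization."""
    seen = {}
    for word in words:
        latin = romanize(word, table)
        if latin in seen and seen[latin] != word:
            return False
        seen[latin] = word
    return True


samples = ["ژ", "زه"]
print(is_injective_on(samples, DIGRAPH))  # digraph table: both come out as "zh"
print(is_injective_on(samples, SINGLE))   # single-letter table keeps them apart
print(romanize("چچ", DIGRAPH), romanize("چچ", SINGLE))  # geminated چ
```

With the digraph table, ژ and زه collide on "zh", while the one-letter table keeps them distinct and renders the geminate as "čč" rather than "chch".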

The above suggestions incorporate a mix of transliteration standards, especially diacritics employed by non-English systems such as French and German. I vote that we adopt one that conforms to an English standard, such as ALA-LC or IJMES. I am very puzzled over why غ must be rendered as ğ. This is confusing, as it is also the modern Turkish yumuşak ğ. The caron is also primarily employed in non-English systems like the DMG; ALA-LC and IJMES do not use it, ever. Does this clarify much for an English speaker? IMHO, not much. Jemiljan (talk) 16:27, 30 September 2016 (UTC)
I think this issue has pretty much been handled. Except for capitalization, which I use strictly as an aid in reading transliterated text. — [Ric Laurent] 16:27, 6 January 2012 (UTC)
Further revival, regarding capitalisation, not just Persian: Wiktionary:Beer_parlour/2014/January#Capital_letters_in_transliterations_of_languages_that_do_not_have_capital_letters. --Anatoli (обсудить/вклад) 23:32, 2 February 2014 (UTC)

New Persian Romanization System

Inconsistencies

The purpose of any Romanisation/transliteration system is to correctly identify and distinguish different letters. These systems were first developed by scholars and library cataloguers in an effort to systematise how non-Latin letters are rendered in Latin script. Furthermore, transliteration systems are not, strictly speaking, phonetic systems. The purpose is to convert written letters, not necessarily to specify their exact sounds (which can change with dialect anyway).

Under the current standard, I observe that four different Persian letters, ز, ذ, ض and ظ, are all transliterated as simply "z", without diacritics to distinguish them. Also, س, ص and ث are all rendered as simply "s". These are phonetic standards that do not in any way distinguish between the original letters. Hence, this Wiktionary "standard" fails to meet the most basic and essential criterion of a transliteration system, which is to distinguish between letters written in a non-Latin script.

For this reason, I vote that additional diacriticals be added that follow an accepted language standard.

Compare with the system developed for Arabic transliteration. While my own preference is against the use of the caron/hacek, since it follows primarily non-English European transliteration standards, I do see that some have argued against the use of digraphs like kh, zh, gh for specific letters. Yet for Arabic, ǧ is used for ج when j is perfectly acceptable, and here we have ğ for غ.

So, some suggestions to clarify and improve the current standard:

س = s

ث = ṯ

ص = ṣ

ز = z

ذ = ẕ

ظ = ż

ض = ẓ Jemiljan (talk) 18:21, 30 September 2016 (UTC)
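
The difference between the proposal above and the practice it criticises can be stated as a one-to-one check on the letter tables. This is a minimal sketch: `PROPOSED` is copied from the list above, `CURRENT` follows the description in this thread (all seven letters collapsing to "s" or "z"), and the helper names are made up for illustration.

```python
# PROPOSED is taken from the list above; CURRENT reflects the practice
# described in this thread. The function names are hypothetical.
PROPOSED = {"س": "s", "ث": "ṯ", "ص": "ṣ", "ز": "z", "ذ": "ẕ", "ظ": "ż", "ض": "ẓ"}
CURRENT = {"س": "s", "ث": "s", "ص": "s", "ز": "z", "ذ": "z", "ظ": "z", "ض": "z"}


def is_one_to_one(table):
    """True if every Persian letter gets its own, distinct Latin symbol."""
    return len(set(table.values())) == len(table)


def invert(table):
    """Build the Latin-to-Persian table; only possible for a one-to-one table."""
    if not is_one_to_one(table):
        raise ValueError("table is not reversible: several letters share a symbol")
    return {latin: letter for letter, latin in table.items()}


print(is_one_to_one(CURRENT))   # the collapsed table cannot be reversed
print(is_one_to_one(PROPOSED))  # the proposed table can
print(invert(PROPOSED)["ẕ"])    # recovers ذ from its Latin symbol
```

A reversible table is what the proposal means by distinguishing the original letters; whether that is the goal of a transliteration on Wiktionary is exactly what the replies below dispute.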

1.) The purpose of transliteration/transcription may be to represent spelling or to represent pronunciation. Both are equally justified. Your claim that the former alone is "the most basic and essential criterium" is uncandid. The question is what is useful in a given context. When I use Persian words in a scientific text – for example: "The principle of vilāyat-i faqīh remains a matter of dispute." – then I do need exact reflection of spelling. But here on Wiktionary we don't do that. Instead we always give the original spelling and only add the transcription for the sake of pronunciation. Your dialectal argument is not valid, because none of the letters in question are distinguished in any dialect of Persian. (You might be able to make a case concerning vowels, however.)
2.) We do use "j" for Arabic ج. I don't think this is a recent change either, but maybe it was. In fact, I prefer using ǧ, because the letter is pronounced /ɡ/ or /ɟ/ in some accents, and is etymologically g as well.
3.) In your proposed transcription you may want to use underlined s instead of underlined t for ث for the sake of consistency. And switch the representations of ض and ظ. That would leave you with the official DMG transcription for Persian. Again, however, I don't think we need it here. Kolmiel (talk) 23:42, 13 February 2017 (UTC)

ou->ow

@ZxxZxxZ, Irman, Dijan: Hi. User:Kaixinguo~enwiktionary insists on using "ow" for the diphthong [ou] instead of "ou". Do you agree with this? (Please ping any active Persian editor if I missed). --Anatoli T. (обсудить/вклад) 10:31, 4 January 2018 (UTC)

I agree: because Persian speakers normally use "ou" for long /u/, it's ambiguous. --Z 13:22, 4 January 2018 (UTC)
@ZxxZxxZ, Kaixinguo~enwiktionary, Irman, Dijan: Thanks. I have updated the policy as the preferred one and added some other symbols used occasionally in diff. Please follow the transliteration policy you have endorsed! :) --Anatoli T. (обсудить/вклад) 22:06, 4 January 2018 (UTC)
@ZxxZxxZ, Kaixinguo~enwiktionary, Irman, Dijan: For the record, I didn't 'insist' on this at all, I made one edit to a word where he had put an erroneous transliteration and I used 'ow' because it was my belief that that was the recent consensus. User:Atitarev's decision to edit the policy is premature. And there is no need to tell User:ZxxZxxZ to follow the policy, when he has always followed every policy anyway. It comes across in a bad way. In fact, it was User:Atitarev who decided unilaterally to change our standard of capitalising proper noun transliterations and then altered only a few of the entries, leaving half in one way and half the other. Kaixinguo~enwiktionary (talk) 11:42, 15 January 2018 (UTC)
@Kaixinguo~enwiktionary: The erroneous transliteration you are talking about in خودرو wasn't my edit, I only corrected "kh" to "x". I don't know why you are trying to put me in a negative light. Converting to lower case wasn't my decision either. The "ow" rule could never be followed if it was never part of the policy. It was always "ou", I changed it since everyone seemed to agree it was better than "ou". --Anatoli T. (обсудить/вклад) 11:57, 15 January 2018 (UTC)
I'm sorry for the late reply. No, not all of us are in agreement. I disagree with the change from "ou" to "ow". Following Z's logic of ambiguity, should we also use "oo" instead of "u" and "eh" instead of every "e"? --Dijan (talk) 16:41, 17 January 2018 (UTC)
That's a different case, a one-letter transcription is better than a two-letter one. But there's no advantage in using "ou" instead of "ow". --Z 12:36, 22 January 2018 (UTC)

⟨نب⟩

@ZxxZxxZ, Vahagn Petrosyan, Kutchkutch, Raxshaan, Victar, RonnieSingh – ping whomever I have forgotten: should we render ⟨نب⟩ in |tr= as ⟨nb⟩ or ⟨mb⟩? Current usage is mixed. I think the latter is to be opted for; it has had the sound value [mb] throughout its history, already in Middle Persian, and Middle Persian reconstructions should also write it. It appears that they wrote m in the Pahlavi script and that the differing practice was taken over from the Arabs. Example Neo-Persian pages: انبه, چنبر, تنبک, تنبان, انبار, تنبسه, جنبیدن, انبوییدن, دنب, زنبق. My standing for Arabic is exactly the opposite, though: there ⟨نب⟩, although pronounced [mb] – it appears in borrowed words like شَنْبَر (šanbar), إِنْبِيق (ʔinbīq), أَنْبُوب (ʔanbūb) and not so in native root formations –, should be rendered ⟨nb⟩ consistently, and transcription is automatic anyway. Fay Freak (talk) 15:57, 1 September 2020 (UTC)

Persian automated transliteration discussion

There is a new discussion on automating Persian transliteration by providing diacritics in the headword - Wiktionary:Beer_parlour/2021/December#Persian_automated_transliteration. --Anatoli T. (обсудить/вклад) 23:41, 15 December 2021 (UTC)

Short "i" in front of a ye

@Ariamihr, Dijan, Mazsch, Qehath, ZxxZxxZ, Tibidibi, Kaixinguo~enwiktionary, Taimoorahmed11, Fay Freak, Fenakhay, Benwing2: Hi, is anyone still active with Persian? Sorry for many pings.

I'd like to revive some interest in automated transliterations and I have a question regarding the use of the short "i" before a ye + another vowel in modern Iranian Persian, such as خیال (xiyâl) (alternative reading of "xayâl"), where we normally use "e". Are there more cases where a short "i" (not "e") would be required? Would you still vocalise it as خِیال (alternative of خَیال)? Anatoli T. (обсудить/вклад) 00:54, 13 September 2022 (UTC)

Bot request for Persian transliterations

(Notifying Ariamihr, Dijan, Mazsch, Qehath, ZxxZxxZ, Sameerhameedy): Hello. I have requested @Benwing2 to standardise transliterations based on the current consensus in this discussion, initially as a simple clean-up but it may possibly impact a bit more:

Wiktionary:Grease_pit/2023/March#Bot_request_for_Persian_transliterations_(fa)

Pls contribute there if you disagree or wish to add. Anatoli T. (обсудить/вклад) 00:32, 8 March 2023 (UTC)

It's also a good time to fix and address any bad transliterations found. Anatoli T. (обсудить/вклад) 00:33, 8 March 2023 (UTC)

ezâfe after ی is unmarked

(Notifying Ariamihr, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd): Hi. Apparently, there are cases where ezâfe is totally unmarked but can be guessed and affects pronunciation, e.g. بازی شطرنج (bâzi-ye šatranj, a game of chess).

Would you vocalise it as بازیِ شَطْرَنْج? We will need a note in this policy doc.‎

Also, there are some unanswered questions in Module talk:fa-IPA. Please also check Module:fa-translit/testcases. You can comment in the talk page. Anatoli T. (обсудить/вклад) 02:27, 21 March 2023 (UTC)

@Atitarev Yes, I would vocalize it that way. This is consistent e.g. with this site (at my old university): [1]. I'm also thinking that if the word ends in written heh but with ezafe pronounced -ye, we should add the hamza diacritic over it (plus kasra) when vocalizing. The hamza diacritic can be set to be automatically removed when creating links, or left in (maybe similar to the Russian ё, which we include in entries), whatever is the preference. Benwing2 (talk) 02:37, 21 March 2023 (UTC)
@Benwing2: Thanks. After you have removed the link from the hamza diacritic over heh, this entry is set the way it should be: کرهٔ جنوبی (kore-ye jonubi). The article doesn't have هٔ but the |head= uses کرهٔ جنوبی, where کرهٔ links to کره. No |alt= is required if هٔ (or diacritics) is the only difference from the page name. Anatoli T. (обсудить/вклад) 02:47, 21 March 2023 (UTC)
If I understand the situation correctly, there is no need for a kasra after a hamza diacritic added to a final unvocalized heh. I mean, it is not even needed for technical purposes, since tools can easily see that the hamza represents an ezafeh. Z 20:56, 21 March 2023 (UTC)
@ZxxZxxZ: Thanks, Z. Using the test translit module, I used کُرِهٔ شُمالی and it successfully transliterates as "kore-ye šomâli". No kasra after a hamza diacritic. Also per Wiktionary:Persian transliteration (towards the bottom of the page). Anatoli T. (обсудить/вклад) 22:57, 21 March 2023 (UTC)
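
The two vocalisation conventions agreed in this thread (kasra on a final ی, hamza over a final heh with no kasra) are enough for a tool to pick the ezâfe suffix from the end of a vocalised word. The sketch below is not the code of Module:fa-translit; the function name is hypothetical, and the "-e" branch for a kasra on a final consonant is an assumption added for completeness.

```python
# Not the actual Module:fa-translit logic; a hypothetical sketch of the
# conventions discussed above. Code points are spelled out to avoid ambiguity.
KASRA = "\u0650"          # Arabic kasra
HAMZA_ABOVE = "\u0654"    # combining hamza above (as in heh + hamza)
HEH = "\u0647"            # Arabic letter heh
HEH_YEH_ABOVE = "\u06c0"  # precomposed heh with yeh above, sometimes typed instead
FARSI_YEH = "\u06cc"      # Persian ye


def ezafe_suffix(vocalized):
    """Return the Latin ezâfe suffix implied by the end of a vocalised word."""
    if vocalized.endswith(HEH + HAMZA_ABOVE) or vocalized.endswith(HEH_YEH_ABOVE):
        return "-ye"  # hamza over final heh: no kasra is needed
    if vocalized.endswith(FARSI_YEH + KASRA):
        return "-ye"  # kasra written on a final ye
    if vocalized.endswith(KASRA):
        return "-e"   # assumption: kasra on a final consonant
    return ""         # no ezâfe marked


print(ezafe_suffix("\u06a9\u0631\u0647\u0654"))        # کرهٔ, as in kore-ye
print(ezafe_suffix("\u0628\u0627\u0632\u06cc\u0650"))  # بازیِ, as in bâzi-ye
```

The function only reports the suffix; transliterating the stem itself (kore, bâzi) still depends on full vocalisation.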

ـه‌ای as e-yi

(Notifying Ariamihr, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd): I've just added this line:

  • ـه‌ای - e-yi (previously we used "e-i" but I think it makes much more sense as "e-yi")

It's also in Module:fa-translit/testcases

Please let me know if you disagree.

@Benwing2. Are you able to take a look at some cases there?

I will also add this case, mentioned earlier: بازیِ شَطْرَنْج (bâzi-ye šatranj, a game of chess). Anatoli T. (обсудить/вклад) 01:21, 22 March 2023 (UTC)

@Atitarev I propose:
  1. Some do pronounce it as e-yi, but I think e-i is more formal and more common for ـه‌ای
  2. i-ye is better for cases like بازیِ شَطْرَنْج‎ (bâzi-ye šatranj, “a game of chess”).
  3. Cases like آمْریکایی/آمْریکائی (âmrêkâ-i, âmrêkâ-yi, "american"), which have long vowels at the end, also have to be considered.
  4. Ezâfe for words which have a hamza at the end after a long vowel, like یاء (yâ'), will remove the hamza: یایِ (yâ-ye)
    • unless in cases like مَبْدَأْ (mabda'), where ezâfe will be مَبْدَأِ (mabda'e) and the indefinite and adjective will be مَبْدَئی (mabda'i)
Light hearted sam (talk) 16:02, 9 May 2023 (UTC)

Re use of hyphens

@Benwing2: You asked me about hyphens (when NOT to use them). Sometimes templates or editors add hyphens when it's not really necessary, for etymological or readability reasons, e.g. plural دیکتاتورها @دیکتاتور (diktâtor). Both "diktâtorhâ" and "diktâtor-hâ" are fine by me, so I don't have a strong opinion. Just an FYI. Anatoli T. (обсудить/вклад) 04:42, 22 March 2023 (UTC)

@Benwing2 Hi. I sort of tried to formalise the use of hyphens at the bottom of the page and described some other corner cases I could think of. Please let me know if you think of something else and if you have questions. Anatoli T. (обсудить/вклад) 06:08, 31 March 2023 (UTC)
@Atitarev Thanks, makes sense, although I'm a bit confused about your statement about "no space or ZWNJ rendered as nothing"; shouldn't it say "ZWNJ" rendered as hyphen? Also I should be able to get to Module:fa-IPA in a day or so. Benwing2 (talk) 06:22, 31 March 2023 (UTC)
@Benwing2: I meant "if there is no "ZWNJ" then no hyphen is used" (for etymological reasons - compounds, suffixes), except for cases described. I will review, thanks. This part might be challenged and the current practice is inconsistent. Anatoli T. (обсудить/вклад) 06:27, 31 March 2023 (UTC)
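
As a rough illustration of the rule as clarified in the last reply (a hyphen only where the Persian spelling has a ZWNJ), here is a minimal sketch. It assumes the segment transliterations are already known, ignores the exceptions the policy page is said to describe, and uses a made-up function name.

```python
# A hypothetical helper: hyphens in the transliteration mirror ZWNJ positions
# in the Persian spelling. Exceptions mentioned on the policy page are ignored.
ZWNJ = "\u200c"  # zero-width non-joiner


def join_by_zwnj(spelling, segment_translits):
    """Join per-segment transliterations with hyphens, one segment per ZWNJ-separated part."""
    parts = spelling.split(ZWNJ)
    if len(parts) != len(segment_translits):
        raise ValueError("need exactly one transliteration per ZWNJ-separated segment")
    return "-".join(segment_translits)


# Spelling with a ZWNJ: two segments, so one hyphen.
print(join_by_zwnj("خانه" + ZWNJ + "ای", ["xâne", "yi"]))
# Spelling without a ZWNJ: one segment, so no hyphen.
print(join_by_zwnj("دیکتاتورها", ["diktâtorhâ"]))
```

Since the comment above notes that current practice is inconsistent, this should be read as one possible reading of the rule, not as settled policy.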

Arabic loanwords with definite article ال

(Notifying Ariamihr, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd): Hello. I am going to add a clause (soon) about cases like فارغ‌التحصیل (fâreğ-ot-tahsil). Does everyone want the page to show:

  1. Assimilation of the Arabic sun letters, here "ol" -> "ot".
  2. Separate with a hyphen? "fâreğ-ot-tahsil" or "fâreğ-ottahsil"?

There were other changes on the page. Pls comment if you wish to change/add or discuss. Anatoli T. (обсудить/вклад) 01:44, 3 April 2023 (UTC)

@Atitarev IMO 'fâreğ-ot-tahsil' with hyphen and assimilation looks good, although of course it depends on the pronunciation; if pronounced with /l/, it should have 'l' instead of 't'. Benwing2 (talk) 02:03, 3 April 2023 (UTC)
@Benwing2: Thanks. It's pronounced with a geminated /tt/, imitating Arabic. I've come across similar borrowings before. There are two recordings on Forvo with /tt/. Anatoli T. (обсудить/вклад) 02:07, 3 April 2023 (UTC)
@Atitarev, @Benwing2
I agree with Benwing2.—Saranamd (talk) 08:44, 3 April 2023 (UTC)
@Saranamd @Benwing2 @Atitarev And so do I. Rodrigo5260 (talk) 11:48, 3 April 2023 (UTC)
@Atitarev, @Benwing2 I agree with others, too. If a hyphen is going to be removed, it is the first one, because these phrases have been borrowed with the definite article al- frozen with the previous word. This is obvious in the case of ممنوع‌الکار (mamnu'-ol-kâr), where کار kâr is a Persian word and the phrase is formed in Persian (see this newspaper). In fact I believe we should have entries such as ممنوع‌الـ (mamnu'-ol-). --Z 14:50, 3 April 2023 (UTC)
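
The outcome of this thread (hyphens on both sides of the article, with sun-letter assimilation) can be sketched as follows. The set of Latin initials is an assumption derived from the Arabic sun letters as they come out in the current Persian transliteration (t, d, r, z, s, š, l, n), and the function name is made up; as noted above, the real test is the pronunciation of each word.

```python
# Hypothetical sketch. SUN_INITIALS is an assumption: the Arabic sun letters
# as they surface in Persian transliteration. Actual pronunciation decides.
SUN_INITIALS = {"t", "d", "r", "z", "s", "š", "l", "n"}


def join_with_article(first, second):
    """Join two transliterated words around the Arabic article, assimilating l before sun letters."""
    initial = second[0]
    linker = initial if initial in SUN_INITIALS else "l"
    return f"{first}-o{linker}-{second}"


print(join_with_article("fâreğ", "tahsil"))  # t is a sun letter: ol becomes ot
print(join_with_article("mamnu'", "kâr"))    # k is not: the article stays ol
```

Both results match the spellings used in the comments above (fâreğ-ot-tahsil, mamnu'-ol-kâr).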

Marking long vowels

(Notifying Ariamihr, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd, Atitarev): There is an issue with the current transliteration policy I would like to address, namely that there is no standard way to transliterate Classical Persian and Dari. This is a problem for multiple reasons, not least that Classical Persian was one of the most influential languages in West, Central, and South Asian history! Yet there is no standard transliteration for it?


The biggest issue is the fact that i is a long vowel in Modern Iranian Persian but a short vowel in Classical Persian (classical اِ and اِی differ solely by length, not quality). However, since i is always a long vowel in Iranian Persian, why don't we just mark it as such? Then it won't conflict with other transliterations, and the transliteration systems used will be in sync with each other. I think this is the best system:


long vowels: â ē ī ō ū

  • (long ē and ō, though uncommon, do appear in Iranian Persian)

and the short vowels: a e i o u

  • (short e and o can appear in Dari and CLS, as, with few exceptions, Ma'rūf vowels shift to Majhūl vowels near glottal consonants.)
  • (the use of a macron '¯' for all except â is intentional, as in Classical Persian all other vowels were distinguished solely by length, and the vowels i and u are always long in IR.)
  • I think Dari and Classical should use the same transliteration; then there would be only 2 spellings in the header max, instead of 3.


New System, Examples:

شیر • sīr (all varieties have been collapsed into ONE wholly accurate transliteration!)

شیر • sīr or sēr (two non-conflicting transliterations)

دل • del or dil

جیلبی • jalēbī (not a word in Iranian Persian, so no alternative transliteration.)

گلاب • golâb or gulâb (imo, this is understandably a difference in quality, as both vowels are transliterated as short vowels).


If we used the two current transliteration systems, we would end up with an unnecessary listing of multiple transliterations:

Current System, Examples:

شیر • sir or sîr (this is dumb, we could write them in one transliteration! Especially since Iranian 'i' and the CLS/Dari 'î' used here do not differ in quality!)

شیر • sir or sêr (in CLS/Dari 'i' can be a short vowel)

دل • del or dil (in Iran 'i' can be a long vowel)

جیلبی • jalêbî (not a word in Iranian Persian, but this gives the implication that 'î' is different in quality from Iranian 'i', when in reality î is only different from CLS/Dari 'i')

گلاب • golâb or gulâb (very ambiguous due to the varying lengths of 'u' and 'o')


This would also cause fewer inconsistencies in etymology sections, when fa-cls has to be specified in order for the transliteration to be accurate to Central and South Asian languages.


Additionally, there would be little issue with converting 'i' and 'u' > 'ī' and 'ū', as pages where i or u are short are almost always marked with 'classical', 'Dari', 'India', 'Archaic', 'Kabul' or 'Hazaragi'. For pages with such markings, I am willing to manually review them myself to fix them. (Seriously, I think this change would be worth the effort!)
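
The bulk conversion described here amounts to a two-character substitution on existing Iranian Persian transliterations. A minimal sketch, assuming the input is an unlabelled Iranian Persian transliteration in the current scheme (entries marked Classical, Dari, etc. would be held back for manual review, as offered above); the function name is made up.

```python
# Hypothetical sketch of the proposed bulk change. It is only valid for
# Iranian Persian transliterations, where i and u are assumed always long.
LONG_MARKS = str.maketrans({"i": "ī", "u": "ū"})


def mark_long_vowels(iranian_translit):
    """Rewrite i and u as ī and ū; â and the short vowels a, e, o are left alone."""
    return iranian_translit.translate(LONG_MARKS)


for word in ["sir", "del", "golâb", "jârub"]:
    print(word, "->", mark_long_vowels(word))
```

Under these assumptions "sir" becomes "sīr" and "jârub" becomes "jârūb", while "del" and "golâb" are unchanged, which is why the conversion would not disturb entries that have no i or u.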


There are two standard language varieties under the banner of "Persian", and with this adjustment a single transliteration system can be loosely applied to both of them, as well as Classical Persian! I love Iranian Persian, but as someone from a Dari-speaking family, it sucks to see Persian Wikipedia and Persian Wiktionary outright exclude Dari. (Persian Wikipedia excludes the mere mention of how countries are spelt in Dari. I could always tell the spellings used on PW didn't match my family's dialect. I had to check Dari news broadcasters such as Tolo News and DW Dari to find the standard Dari spellings of country names, and add them to Wiktionary!) Educational material in Dari is already so difficult to come by; why turn down opportunities to make educational material more accessible to Dari speakers? Or the Afghan diaspora?


Prioritizing Iranian Persian makes sense, but Iranian Persian can be prioritized without the exclusion of Dari and Classical Persian. Persian Wiktionary is currently the only online dictionary with Dari, Classical, and Iranian Persian and can easily serve all dialects with only a minor adjustment. So I would like to vote on changing the current transliteration policy so that it can be loosely applied to all varieties, or at least discuss the current transliteration policies. As the population of Dari speakers with internet access increases, I hope Persian Wiktionary can become a project for them too. Sameerhameedy (talk) 01:25, 1 May 2023 (UTC)

@Sameerhameedy: Hi.
Most of what you suggest makes sense (theoretically). Only I am not sure about the inconsistencies with macrons over carets, even if long and short â/a are different by quality as well.
I saw some of your previous edits where you attempted to label transliterations by the variety. It's not currently sustainable. I have reverted them, when going through a list of bad transliterations, which @Benwing2 has made. The current issues per your proposal:
  1. No language currently supports multilingual transliterations; nothing went beyond discussions. That is without splitting into different languages. |tr2=, |tr3= should go along with |head2=, |head3=, etc. They are not labelled. So simply using different transliterations without labels, formats or colours will be confusing.
  2. It is possible to do all the work in the pronunciation section (but one variety should be the default). Not just IPA but generated or manually input transliterations and vocalisations. I already pointed you to the Chinese pronunciation sections, now coming with lots of different transliterations. BTW, if Persian were consistently vocalised, it would be easier to derive the transliterations and IPA from the Perso-Arabic spellings. Currently, everything has to be done by a bot, including policy changes, which is a lot of work.
  3. Lack of Dari or Classical Persian is due mostly to the lack of online resources and editors or their own desire. I found some downloadable dictionaries, and these two, https://dsal.uchicago.edu/dictionaries/persian/ (Hayyim and Steingass combined), are sort of what we need, but they are not error-free and often use Arabic letters instead of Persian (not the only ones like this).
  4. Another way is to make Classical Persian the default. Some people supported this idea but this may put off contributors and it may be confusing to use modern or colloquial words using that transliteration.
Consistently using "î" and "û" (or "ī" and "ū") instead of simple "i" and "u" for long vowels is more achievable than the rest.
What do you think of the split into Iranian Persian/Classical Persian and Dari? Do you think the idea will be supported by the community? Is it worth it, considering the amount of duplicated contents? Anatoli T. (обсудить/вклад) 02:14, 1 May 2023 (UTC)
@Atitarev, Yes, I stopped trying to label the transliteration in the header because someone (I don't remember who) made me aware that it caused certain issues. BTW, is it really possible to use color to distinguish different varieties in the header? I think that could be a viable solution if that's actually possible.
But no, I would personally prefer to take the approach of Korean and (to a lesser extent) Chinese Wiktionary, rather than split Classical and Dari Persian from Iranian Persian. (Unless other contributors here believe that doing so is for the best.) Iranian and Dari Persian mainly diverged in vowels, so that means that words are usually still spelt the same in the Arabic script. (Excluding the names of newer technologies and countries, which are usually unrelated.) Unless vowels are written (as they are in transliterations), the differences are not usually noticeable in writing. I think, considering that, it's just easier to have multiple transliterations on the Persian Wiktionary.
I'm fine with Iranian Persian being the default, I just would prefer that Dari and Classical weren't completely excluded. (On Chinese Wiktionary, other varieties can be used in examples and quotes, though they have a little language marker next to them. I think that could be helpful as well.) There isn't even a conjugation template for Dari, and all conjugation templates include Standard Iranian Persian and Tehrani. Since conjugation tables are collapsible, there's no reason there can't be one for Iranian Persian and one for Dari/Classical. Especially since some pages include multiple tables for other regional dialects in Iran.
Though for automatic generation of transliterations and IPA transcriptions, Classical Persian should be the default, as it already is for the fa-IPA template, since Classical can (usually) be accurately converted to its descendants.
Also, I assumed the lack of Dari was due to the lack of Internet access in Afghanistan. Though I noticed many South Asian languages on Wiktionary had already been using Classical transliterations (but only when the Iranian pronunciation was extremely different from the Classical pronunciation). And the fa-IPA template uses Classical Persian, so there is already a widely used Classical transliteration; it just isn't included in the Persian transliteration policy. -Sameerhameedy (talk) 03:31, 1 May 2023 (UTC)
@Sameerhameedy, @Benwing2: When Persian is modularised, it is already possible to use this, even with the current Persian templates and a bit of work:
Option 1 - optional multiple parameters in the headword, similar to Serbo-Croatian, Hindi/Urdu pairs. Note that it was used for two varieties on these pairs even before they were modularised. It can get messy if there are many |head=, |tr=, |pl=, |pltr= (and their 2, 3, etc. versions)
گُلاب • (golâb), Classical Persian: گُلاب • gulāb, Dari Persian: گُلاب • gulāb
Option 2 - headwords on multiple lines
گُلاب • (golâb) (Iranian Persian)
گُلاب • (gulāb) (Classical Persian)
گُلاب • (gulāb) (Dari Persian)
(Just an idea, pls ignore any transliteration issues, I am not familiar with classical/Dari) Anatoli T. (обсудить/вклад) 05:19, 1 May 2023 (UTC)
@Atitarev I think option 2 is easier to read. For option 2 would it be possible to collapse headers when they're the same? like:
گُلاب‎ • (golâb) (Iranian Persian)
گُلاب‎ • (gulāb) (Dari, Classical)
but when none can be collapsed together:
جاروب‎ • (jârub) (Iranian Persian)
جاروب‎ • (jārōb) (Classical Persian)
جارُوب‎ • (jārūb) (Dari Persian)
If they always need to be separate, then that's fine. Just wondering if it's possible to keep the headers condensed wherever possible?
I think 2 is the best option but I'm a little worried about where plurals would go. سَمِیر حَمِیدِی (talk) 08:15, 1 May 2023 (UTC)
@Sameerhameedy: Each definition line is manual, separated by <br />. I am thinking that applicable varieties can be flagged by an additional parameter. If you, e.g. apply |pes=1 (Iranian Persian) on one line, and |prs=1, |cls= on the other, you get Iranian Persian on one line and Dari, Classical on the other (your first example). At least one will be made mandatory or default to Iranian Persian, if unspecified.
Basically you will need those parameters and corresponding transliteration parameters enabled: |prstr=, |prstr2= and |clstr=, etc. The parameter |head= should be made redundant as in Arabic, it should be unnamed but |head2=, |head3= made optional. |head= is needed if you want to provide vocalisations. The headword can become quite complex but not as complex as Arabic. If it gets too crowded, the classical and Dari can be made on separate lines, even if they are identical.
I am not a module developer, just assessing what can be done and sharing ideas.
(Your signature is cool but it will make you unrecognisable to most users on Wiktionary) Anatoli T. (обсудить/вклад) 08:38, 1 May 2023 (UTC)
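
The "collapse headword lines when the transliteration is the same" idea from this exchange can be sketched outside any template. This is not an existing Wiktionary module: the function name and output format are hypothetical, and the sketch assumes a single shared vocalised head, which does not cover cases where the varieties need different vocalisations.

```python
# Hypothetical sketch, not an existing template or module. It assumes one
# shared head spelling and groups varieties by identical transliteration.
def headword_lines(head, translits):
    """Return one headword line per distinct transliteration, listing the varieties that share it."""
    grouped = {}
    for variety, translit in translits.items():
        grouped.setdefault(translit, []).append(variety)
    return [
        f"{head} • ({translit}) ({', '.join(varieties)})"
        for translit, varieties in grouped.items()
    ]


lines = headword_lines(
    "گُلاب",
    {"Iranian Persian": "golâb", "Dari": "gulāb", "Classical": "gulāb"},
)
for line in lines:
    print(line)
```

With the values above, Dari and Classical share one line and Iranian Persian gets its own; when all three transliterations differ, the same function yields three lines.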

Standardising transliterations - remaining points 2

Someone did this a while ago on Wiktionary talk:About Persian, and I thought I would do the same here to clear up some remaining points about Persian transliteration. Please vote for your preferences. State the reason if you can.

1. 'nb' or 'mb' for نب ?

Example: دوشَنْبه‎ as 'došanbe' or 'došambe'.

'nb' (došanbe)

'mb' (došambe)

2. ـه‌ای‎ as 'e-i' or 'e-yi'?

Example: خانه‌ای as 'xâne-i' or 'xâne-yi'

'e-i' (xâne-i)

'e-yi' (xâne-yi)

3. should different forms of the same sound (like س ث ص, ز ذ ظ ض, etc.) be represented differently in transliterations?

Examples: If yes, سوریه will be 'suriye' but ثُرَیّا will be 'ṯorayyâ'. For now, غ and ق are represented differently (ğ and q, respectively) despite having the same sound in Iranian Persian.

yes, should be different (suriye and ṯorayyâ)

no, shouldn't be different (suriye and sorayyâ)

  • Support It makes sense to use different symbols but it never happens in practical terms, partly because most dictionaries don't make a distinction. It would be more practical to even discuss this if Persian consistently used vocalisation and the transliteration was (mostly) automated. --Anatoli T. (обсудить/вклад) 23:30, 15 May 2023 (UTC)

4. و (long vowel) as 'u' or 'ô'?

Example: روزْ as 'ruz' or 'rôz'

u (ruz)

ô (rôz)

Support but ONLY for Classical and Dari. This unsigned comment was added by Sameerhameedy (talk • contribs).
@Sameerhameedy: This may work if multiple headwords are used, with clear labels showing which transliteration belongs to which Persian variety. Multiple transliterations on the same headword can also work but can become crowded when there are many. (Referring to previous discussions we had.) This unsigned comment was added by Atitarev (talk • contribs).

5. ی (long vowel) as 'i' or 'ê'?

Example: پَهْلَوی as 'pahlavi' or 'pahlavê'

i (pahlavi)

ê (pahlavê)

Support but ONLY for Classical and Dari. سَمِیر | sameer (talk) 22:12, 15 May 2023 (UTC)
I think this should be combined with the discussion/vote on how to accommodate multiple transliterations. Anatoli T. (обсудить/вклад) 23:44, 15 May 2023 (UTC)
@Atitarev Yes, maybe we should vote on how to accommodate multiple transliterations. I wanted input from other editors on what they thought would be the best way but I didn't get a lot of responses. سَمِیر | sameer (talk) 23:52, 15 May 2023 (UTC)

6. hyphens for plurals using ها (hâ)

I support hyphens in ZWNJ on joining letters and non-joining letters

Examples: کِتابْ‌ها as 'ketâb-hâ', کِتابْ‌ ها as 'ketâb hâ', کِتابْها as 'ketâbhâ', and اُتوها as 'otu-hâ'

I support hyphens only in ZWNJ on joining letters

Examples: کِتابْ‌ها as 'ketâb-hâ', کِتابْ‌ ها as 'ketâb hâ', کِتابْها as 'ketâbhâ', and اُتوها as 'otuhâ'
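
Judging by the example lists, the two options in point 6 differ only in the last case: a plural suffix written directly after a letter that does not join to the following letter. A rough Python sketch of that decision is below. It rests on one reading of the option headings, and the names `plural_separator` and `hyphenate_non_joining` are invented for this illustration; nothing here is taken from an existing module.

```python
import unicodedata

ZWNJ = "\u200c"                  # zero-width non-joiner
PLURAL_SUFFIX = "\u0647\u0627"   # the plural suffix (he + alef)
# Letters that never connect to the following letter:
# alef, alef with madda, dal, zal, re, ze, zhe, vav.
NON_JOINING = set("\u0627\u0622\u062f\u0630\u0631\u0632\u0698\u0648")


def plural_separator(word, hyphenate_non_joining):
    """Return what separates stem and suffix in transliteration: '-', ' ' or ''.

    hyphenate_non_joining=True models the first option (hyphen also after
    non-joining letters); False models the second option (hyphen only where
    a ZWNJ is actually written).
    """
    if not word.endswith(PLURAL_SUFFIX):
        raise ValueError("word does not end in the plural suffix")
    stem = word[: -len(PLURAL_SUFFIX)]
    if stem.endswith(" "):
        return " "   # suffix written as a separate word
    if stem.endswith(ZWNJ):
        return "-"   # suffix attached with a ZWNJ
    # Ignore vowel marks such as sukun when finding the last base letter.
    base_letters = [ch for ch in stem if not unicodedata.combining(ch)]
    if hyphenate_non_joining and base_letters and base_letters[-1] in NON_JOINING:
        return "-"
    return ""        # suffix joined directly to the stem
```

The code points are written as escapes so that the invisible ZWNJ cannot be lost when copying the snippet.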

7. should ئ and ؤ (') link to ی and و ?

Examples: سُؤالْ (so'âl) link to سوال and هِیْئَت (hey'at) to هییت.

no, سوال should be a misspelling of سؤال

Support Should reflect standard spelling which includes hamza. Though I believe this was discussed before. سَمِیر | sameer (talk) 22:14, 15 May 2023 (UTC)
@Sameerhameedy: BTW, this hasn't been actioned accordingly. The links work correctly. سوال should be moved to سؤال. The |head= could use "سُؤال" but "ؤ" is part of the page name. Anatoli T. (обсудить/вклад) 00:15, 16 May 2023 (UTC)
Support This was discussed and the decision was made, not a new question. --Anatoli T. (обсудить/вклад) 23:34, 15 May 2023 (UTC)

yes, they should

8. how should words like سَرْاَنْجامْ, in which the اَ (a) sound is separate from the preceding consonant, be transliterated?

should be like sar'anjâm

should be like sar-anjâm

If you think anything else should also be added, let me know. Light hearted sam (talk) 15:46, 15 May 2023 (UTC)

(Notifying Ariamihr, Atitarev, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd). Also @Fenakhay, Fay Freak, Benwing2, Erutuon, Tibidibi, Kaixinguo~enwiktionary, if you're interested. Light hearted sam (talk) 16:09, 15 May 2023 (UTC)
@Light hearted sam Some of these are hard to answer because they refer to different dialects though. روز would be 'ruz' in Iranian Persian but 'rōz' in Classical, Dari, and Tajik. سَمِیر | sameer (talk) 22:05, 15 May 2023 (UTC)
@Light hearted sam There are some fundamental flaws with this poll. First, it does not take into account that و can represent multiple vowels in all dialects. Yes, in Iranian Persian و and ی are usually the ma'ruf vowels (u and i), but the majhul vowels (ê and ô) still appear! In other dialects, the majhul vowels ê and ô (more accurately ē and ō for them) are actually extremely common. They are less common in Iranian Persian, which has largely shifted away from the majhul-ma'ruf vowel system present in Classical and Dari, but majhul vowels still do appear in all dialects, including Iranian Persian!
Saying that و must always be 'u' and that ی must always be 'i' would not only be problematic for Iranian Persian (granted, in only a few words) but would also require Dari and Classical Persian to be entirely separated from Iranian Persian, especially for Dari-specific or archaic words that don't exist in modern Iranian Persian and can't use that transliteration. سَمِیر | sameer (talk) 22:33, 15 May 2023 (UTC)
Oh, you're right.
So it should be like this:
i and u for ma'ruf ی and و
ē and ō for majhul ی and و Light hearted sam (talk) 09:41, 16 May 2023 (UTC)
There are currently issues/inconsistencies with our transliteration system due to the phonological differences between Iranian Persian and other dialects (Classical & Dari). There are discussions about fixing these systems, but as of right now Iranian Persian has the only standard transliteration, though there is a de facto transliteration system for Classical and Dari when the Iranian transliteration can't be used.
Current Practice:
For Iranian Persian: i and u for ma'ruf ی and و
For Classical Persian and Dari: î and û for ma'ruf ی and و
For All Dialects: ê and ô for majhūl ی and و
System Under Discussion:
For Iranian Persian: i and u for ma'ruf ی and و; ê and ô for majhūl ی and و
For Classical Persian and Dari: ī and ū for ma'ruf ی and و; ē and ō for majhūl ی and و سَمِیر | sameer (talk) 00:17, 17 May 2023 (UTC)
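
For reference, the "System Under Discussion" list above can be written as a small lookup table. This Python sketch only restates that list; the table name `PROPOSED`, the helper `long_vowel`, and the variety labels used as keys are assumptions for illustration and do not correspond to any existing module or template parameter.

```python
YE = "\u06cc"   # Persian ye
VAV = "\u0648"  # vav

# variety -> vowel class -> letter -> transliteration,
# copied from the "System Under Discussion" list (hypothetical layout).
PROPOSED = {
    "Iranian Persian": {
        "ma'ruf": {YE: "i", VAV: "u"},
        "majhul": {YE: "ê", VAV: "ô"},
    },
    "Classical Persian": {
        "ma'ruf": {YE: "ī", VAV: "ū"},
        "majhul": {YE: "ē", VAV: "ō"},
    },
    "Dari": {
        "ma'ruf": {YE: "ī", VAV: "ū"},
        "majhul": {YE: "ē", VAV: "ō"},
    },
}


def long_vowel(variety, vowel_class, letter):
    """Look up the proposed transliteration of a long vowel letter."""
    return PROPOSED[variety][vowel_class][letter]
```

The vowel class has to be supplied by the editor, since the script itself does not mark whether a given ی or و is ma'ruf or majhul.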

Standardization of Classical + Dari Persian

(Notifying Ariamihr, Atitarev, Benwing2, Dijan, Mazsch, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd)

Based on several conversations, Classical Persian and Dari Persian now have a (shared) standard transliteration. You may review it at Persian transliteration/Dari. Leave any concerns on the corresponding talk page.

The transliteration scheme utilized by Iranian Persian has not been changed.

سَمِیر | sameer (talk) 00:34, 5 June 2023 (UTC)

@Sameerhameedy: Hi. The pings didn't go through, since the ping and the signature need to be in the same edit. Repeating the pings: (Notifying Ariamihr, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd): Anatoli T. (обсудить/вклад) 00:30, 7 June 2023 (UTC)