Template talk:fa-regional

From Wiktionary, the free dictionary

Couldn't this be given a better name? It's not just for {{fa}} entries, and "regional" doesn't quite explain it. --Μετάknowledge (discuss/deeds) 01:59, 22 July 2012 (UTC)

Optional 2nd Dari

@Erutuon: Hi. Are you able to add an optional 2nd Dari parameter, e.g. dari2, please?

E.g. on اتوبوس (otobus) I want to add both Dari words, بس (bas) and سرویس (sarvis). Anatoli T. (обсудить/вклад) 05:22, 9 March 2023 (UTC)

Done, responded in Grease Pit. — Eru·tuon 04:16, 30 March 2023 (UTC)

Possibly indiscriminate use

(Notifying Ariamihr, Atitarev, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy):

I don't think it's particularly useful to indiscriminately add this template with the same entry for Dari and Iranian, and a Cyrillic transliteration for Tajik. For example, چراغدان, صقلیه, اسلوب.

This template should ideally show actual lexical variation in standard/common language, e.g. حرف زدن, ماشین, کره.

The fact is that all major classical Persian writers such as Sa'di, Hafiz, Nasir-i Khusraw, Firdawsi, have been transliterated into Cyrillic. So for words with any antiquity, you can likely find the "Tajik" form in a book somewhere. This doesn't mean that they're necessarily in common use in Tajikistan.

If we are to add Cyrillic/Tajik forms to all Persian words, they should be added to the header like other languages with multiple scripts (Hindi/Urdu, Serbo-Croatian).

The difference with {{ko-regional}} is that only a small minority of Korean words differ between NK and SK, meaning that the template is useful, whereas {{fa-regional}} in its current application could theoretically be added to almost every single entry, showing the exact same word for Dari and Iranian and the exact same word in Cyrillic for Tajik. I don't feel this is useful, and it does not fit the precedent of other languages with multiple scripts. --Saranamd (talk) 14:03, 17 September 2023 (UTC)

@Saranamd I agree 100%, fa-regional should only be used if the word is dialectal. Otherwise the Tajik spelling should be in the header. Currently fa-regional is the only way to show the Tajik spelling, which is the only reason it's included for terms that aren't dialectal.
Though, I'm not sure if it's possible for a bot to tell whether fa-regional is being used on a dialectal term or not (detecting a difference between Dari and Iranian Persian is possible, but if the Tajik term was different IDK how a bot could tell). So that change would probably have to be done manually, unless Ben knows a way for a bot to discern that. سَمِیر | Sameer (مشارکت‌هاکتی من گپ بزن) 14:22, 17 September 2023 (UTC)
@Sameerhameedy @Saranamd I'm not sure whether it's possible by bot to tell whether a given Tajik term is expected or unexpected; I think we'd need to know the Dari term including all the vowels and map the Dari vowels to the Tajik vowels? But I can add a param for Tajik to the headword; this is similar to what's done for Hindi/Urdu. Benwing2 (talk) 17:09, 17 September 2023 (UTC)
@Benwing2 hmm maybe a bot can check if the Tajik romanization matches the Tajik reading in fa-IPA? Though the Tajik reading is not a perfect 1:1 match, it might be close enough for the bot to tell when Tajik is the same word most of the time? The main issue I can think of is that Tajik's adoption of the glottal stop is a little random.
On the other hand... fa-IPA could generate an accurate Tajik spelling <90% of the time. But I think that might not be a good idea, because no other language on Wiktionary includes alternative scripts in the pronunciation section. I also discussed this with @Atitarev, who thought it could be problematic for Persian words not attested in Tajik. Perhaps it could generate them temporarily though, just for a bot to use for checking (and make them invisible in the meantime)? سَمِیر | Sameer (مشارکت‌هاکتی من گپ بزن) 19:16, 17 September 2023 (UTC)
@Sameerhameedy Yeah I think that is possible although it would help if you can give me some pointers on the Cyrillic <-> reading equivalents. In terms of generating the Tajik spelling, I think we'd have to add a param turning on the generation of the spelling (and reading?), and not do it by default. E.g. |tg=+ to autogenerate the Tajik spelling and |tg=фообар to specify a particular spelling. It's possible to run a bot to add this param based on existing Tajik entries. Benwing2 (talk) 19:36, 17 September 2023 (UTC)
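A minimal sketch of the opt-in behaviour proposed here, written in Python purely for illustration (the real headword code would live in a Lua module). The function name `resolve_tg` and the `generate` callback are hypothetical; only the `|tg=+` and `|tg=<spelling>` conventions come from the comment above.

```python
from typing import Callable, Optional


def resolve_tg(tg_param: Optional[str],
               reading: str,
               generate: Callable[[str], str]) -> Optional[str]:
    """Decide which Tajik spelling, if any, the headword should show.

    tg_param -- raw value of the hypothetical |tg= parameter (None if absent)
    reading  -- the Tajik reading available for the entry
    generate -- assumed helper that turns a reading into Cyrillic
    """
    if not tg_param:
        # Nothing is generated by default: the parameter is opt-in.
        return None
    if tg_param == "+":
        # |tg=+ asks for the spelling to be autogenerated from the reading.
        return generate(reading)
    # Any other value is taken as a manually specified spelling.
    return tg_param
```

Keeping generation behind an explicit value means entries whose Tajik form is unattested simply show nothing.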
@Benwing2 The Tajik reading is based on the character mappings at Module:tg-translit and TG TR, and all the character equivalents should be the same. fa-IPA shouldn't generate a Tajik reading if nothing was entered into the blank field |1= or the field |tg=. I think Atitarev was worried about some situations where a (former) colloquial pronunciation becomes standardized in Tajik.
On generating the spellings: if it's possible to get the header template to "fetch" the Tajik reading in fa-IPA, I could make a backwards version of tg-translit to generate a Cyrillic spelling in the header? Then the bot would just need to check if the text in the header and in fa-regional are the same, and delete fa-regional if it's redundant? سَمِیر | Sameer (مشارکت‌هاکتی من گپ بزن) 20:28, 17 September 2023 (UTC)
@Sameerhameedy, @Saranamd, @Benwing2: I agree with the suggestion, if you desire to limit to cases where words are different. Knowing the standard mappings between Persian and Tajik, like "â" = Cyrillic "о", it's okay to limit to words that are definitely different. Perhaps also, where there are multiple variants, as in آلمان: {{fa-regional|جَرْمَنِی|آلْمان|Олмон|tg2=Германия}}
Re: glottal stop in Tajik: it's totally ignored at the beginning of words but you can find them in the middle or end of words.
  1. Typically "ъ" is used for Persian "ع" in Arabic loanwords, e.g. таъриф (taʾrif) = تَعْریف (ta'rif). Less commonly or never (?) for hamze.
  2. Not required in positions, such as زیبائی (zibâ'i) = زیبایی (zibâyi) = зебоӣ (zeboyī)
Anatoli T. (обсудить/вклад) 07:53, 18 September 2023 (UTC)
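To make the mapping idea concrete, here is a rough Python sketch of a reversed transliteration. It is only an illustration: the table covers just the letters seen in the examples in this thread (таъриф, нашъа, Олмон), the name `to_cyrillic` is made up, and digraph or context-dependent cases such as зебоӣ (zeboyī) are deliberately not handled.

```python
# Partial Latin -> Cyrillic table, limited to letters attested in the
# examples quoted in this discussion. A real reverse of Module:tg-translit
# would need the full alphabet and multi-letter sequences.
LATIN_TO_CYRILLIC = {
    "a": "а", "f": "ф", "i": "и", "l": "л", "m": "м", "n": "н",
    "o": "о", "r": "р", "t": "т",
    "\u0161": "ш",  # š
    "\u02be": "ъ",  # ʾ (glottal stop / ayn in the romanization)
}

GLOTTAL = "\u02be"


def to_cyrillic(reading: str) -> str:
    """Convert a lowercase romanized reading to Cyrillic (sketch only)."""
    # A word-initial glottal stop is not written in Tajik, per the note above.
    reading = reading.lstrip(GLOTTAL)
    out = []
    for ch in reading:
        if ch not in LATIN_TO_CYRILLIC:
            # Fail loudly instead of guessing an unmapped letter.
            raise KeyError(f"no mapping for {ch!r} in this partial table")
        out.append(LATIN_TO_CYRILLIC[ch])
    return "".join(out)
```

With this table, `to_cyrillic("taʾrif")` gives таъриф, and a reading with a leading glottal stop comes out without ъ at the start.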
Actually, "ъ" can stand for Arabic hamza as well, e.g. нашъа (našʾa) from Arabic نَشْأَة (našʔa) (?).
@Fenakhay, @Allahverdi Verdizade: is this right? What is نَشْأَة (našʔa)? Anatoli T. (обсудить/вклад) 08:15, 18 September 2023 (UTC)
@Atitarev نَشْأَة‎ (našʔa) seems to mean "origin, upbringing, early life" etc. I don't see any relation to cannabis. Benwing2 (talk) 19:33, 18 September 2023 (UTC)
It's possible this is rather from نَشِقَ (našiqa, to inhale, to sniff). /q/ changes into a glottal stop in several Arabic lects. Benwing2 (talk) 19:36, 18 September 2023 (UTC)
@Benwing2: Thanks, it's diff by @Allahverdi Verdizade, which misled me. Anatoli T. (обсудить/вклад) 20:15, 18 September 2023 (UTC)
I can't figure out what you find wrong with the borrowing from Persian نشئه, and further from the Arabic etymon given there. Allahverdi Verdizade (talk) 00:20, 20 September 2023 (UTC)
@Allahverdi Verdizade How did "origin" change into "drunk"? This should be clarified in the etymology. Benwing2 (talk) 01:23, 20 September 2023 (UTC)
I’d rather put my bet on نَشْوَة (našwa, intoxication), more often borrowed into languages. Fay Freak (talk) 01:44, 20 September 2023 (UTC)
@Fay Freak, @Allahverdi Verdizade, @Benwing2: It has a Persian descendant نشوه (našve) Anatoli T. (обсудить/вклад) 03:07, 20 September 2023 (UTC)
The verbal noun ن ش ء literally means to 'be high'. Allahverdi Verdizade (talk) 09:54, 20 September 2023 (UTC)
Literally. Not as in English applied to drugs, as far as we know. For that one uses ن ش و (n-š-w), wherefore ELA II sensibly derived this root from the literal one. And نشوه (našve) is therefore the original of نشئه (našʼe), of which we list other misspellings. Perhaps و was read as ؤ (ʔ). Fay Freak (talk) 12:30, 20 September 2023 (UTC)
@Benwing2 So do you think I should make a reversed version of Module:tg-translit or are you going to just compare transliterations via bot? Unless you have your own plans to deal with this issue, in which case I'll leave it to you. سَمِیر | Sameer (مشارکت‌هاکتی من گپ بزن) 02:39, 19 September 2023 (UTC)
@Sameerhameedy Sorry, I don't quite have the context on what would make the most sense. Yes it's possible to have the headword code call into {{fa-IPA}} to get the Tajik reading. Probably what would make the most sense is to have a reverse-translit version that generates Cyrillic from the reading; then a bot can convert Tajik words in {{fa-regional}} into |tg=+ if it matches what the auto-generated Cyrillic would be, and otherwise convert into |tg=фоо where фоо stands for the manually entered unpredictable Cyrillic spelling. We'd still have to decide under what circumstances it makes sense to keep {{fa-regional}} if the information is also available in the header. Benwing2 (talk) 02:55, 19 September 2023 (UTC)
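The bot decision described here could look roughly like the following Python sketch. The function name `tg_param_for` is hypothetical, and `generate` stands in for whatever reverse-transliteration routine ends up existing; only the `|tg=+` versus `|tg=<spelling>` outcome is taken from the comment above.

```python
from typing import Callable


def tg_param_for(regional_tajik: str,
                 reading: str,
                 generate: Callable[[str], str]) -> str:
    """Return the headword parameter a bot would write for one Tajik form.

    regional_tajik -- the Cyrillic spelling currently given in {{fa-regional}}
    reading        -- the Tajik reading for the entry
    generate       -- assumed reverse-transliteration helper (reading -> Cyrillic)
    """
    if generate(reading) == regional_tajik:
        # The spelling is predictable, so let the header autogenerate it.
        return "|tg=+"
    # Unpredictable spelling: carry the manual form over unchanged.
    return f"|tg={regional_tajik}"
```

Comparing against the generated form, and not against the Arabic-script headword, keeps the check independent of how the Dari and Iranian forms are vocalized.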
@Benwing2 Okay, well if you're leaning towards generating Cyrillic spellings I'll start working on a backwards transliteration module. Feel free to stop me if you change your mind or anything.
For now, the bot should just remove fa-regional if it's entirely redundant (i.e. all dialects are the same as the header). This should actually remove fa-regional the majority of the time. Then the number of entries needing manual checking should be very small. سَمِیر | Sameer (مشارکت‌هاکتی من گپ بزن) 03:46, 19 September 2023 (UTC)
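A rough Python sketch of that redundancy test, for illustration only. The names `strip_harakat` and `is_redundant` are hypothetical, and the sketch assumes the bot has already parsed the template into lists of Dari, Iranian and Tajik forms and knows the header spelling in both scripts. It compares Arabic-script forms with vowel marks removed, so a difference that shows up only in vocalization would not be detected.

```python
from typing import Optional, Sequence

# Arabic combining vowel marks, fathatan (U+064B) through sukun (U+0652).
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_harakat(text: str) -> str:
    """Remove vowel marks so vocalized and plain spellings compare equal."""
    return "".join(ch for ch in text if ch not in HARAKAT)


def is_redundant(header_fa: str,
                 header_tg: Optional[str],
                 dari: Sequence[str],
                 iranian: Sequence[str],
                 tajik: Sequence[str]) -> bool:
    """True if every regional form just repeats what the header already shows."""
    plain_header = strip_harakat(header_fa)
    arabic_same = all(strip_harakat(form) == plain_header
                      for form in [*dari, *iranian])
    tajik_same = all(form == header_tg for form in tajik)
    return arabic_same and tajik_same
```

An entry like آلمان with {{fa-regional|جَرْمَنِی|آلْمان|Олмон|tg2=Германия}} would be kept, since the Dari form and the second Tajik form both differ from the header.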