Module talk:inc-ash/dial

From Wiktionary, the free dictionary
Latest comment: 3 years ago by Kutchkutch

@Bhagadatta, Kutchkutch: I've added automatic dialect categorization based on the table, since right now we are being a bit redundant in including the {{label}} with arguments for each dialect. Do you think those are still necessary? —AryamanA (मुझसे बात करेंयोगदान) 23:58, 4 September 2020 (UTC)

@AryamanA: So where the dialect forms are identical to the actual entry, they'll get automatically categorized? This is a great idea! -- Bhagadatta (talk) 01:53, 5 September 2020 (UTC)
@Bhagadatta: Yep! Removes the need to redundantly maintain the labels that way. —AryamanA (मुझसे बात करेंयोगदान) 02:05, 5 September 2020 (UTC)
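For illustration, the categorisation idea discussed above can be sketched roughly as follows. This is a Python sketch, not the module's actual code: the function name, the data layout (dialect name mapped to a list of forms) and the placeholder forms are all assumptions. Only the category pattern "<dialect> Ashokan Prakrit" is taken from the discussion.

```python
def dialect_categories(pagename, dialect_forms, lang_name="Ashokan Prakrit"):
    """Return a category name for every dialect whose forms include the page title.

    dialect_forms is a hypothetical mapping: dialect name -> list of attested forms.
    """
    categories = []
    for dialect, forms in dialect_forms.items():
        # A dialect is categorised only when one of its forms equals the page title.
        if pagename in forms:
            categories.append(f"{dialect} {lang_name}")
    return categories


# Placeholder data for illustration only; these are not real attestations.
example_table = {
    "Girnar": ["form-a"],
    "Sopara": ["form-b", "form-a"],
    "Shahbazgarhi": ["form-c"],
}

print(dialect_categories("form-a", example_table))
```

With this layout, a page whose title matches no form in the table simply gets no dialect categories.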
@AryamanA, Bhagadatta: Manually maintaining the {{label}} arguments for each dialect is very tedious, so automatising the categorisation based on the corresponding data module is certainly helpful. On the Desktop View, the table is collapsed by default. So if there are no headline labels, the table is collapsed and if the reader doesn't expand the table, the only indication of which dialects match the current page title is the categories at the bottom of the page. So the labels do have some purpose. If you don't think this is an issue, then go ahead with removing the labels. If this is an issue, then the tedious maintenance of the labels would be necessary unless the labels can also be generated automatically. This doesn't apply to the Mobile View since the table is visible without an option to collapse it.
Here are some other assorted issues about dialects that should be discussed: (Sorry about the long list!)
The Allahabad label corresponds with Kosambi in the table and the category CAT:Kosambi Ashokan Prakrit. Should this dialect be renamed to Allahabad-Kosambi to match the scheme of the other hyphenated dialects? (LHS = present location, RHS = original location)
There are spelling variants for some of the names (Kosambi vs Kaushambi, Lauria vs Lauriya, Khalsi vs Kalsi, Lumbini vs Rummindei), but the choice of one spelling over another doesn't seem to matter.
Since the primary source for the dialectal data is Hultzsch and Hultzsch doesn't cover every dialect (such as Gujarra 𑀅𑀲𑁄𑀓 (asoka)), many more dialects may need to be added in the future with additional research and sources. Outside of Prakrit, perhaps εὐσέβειᾰ (eusébeia) is the only example of how the Greek and Aramaic edicts would be covered.
At Talk:बद्ध, the possibility of w:Ashokan Prakrit was introduced to clarify questions about what Ashokan Prakrit is in addition to WT:About Ashokan Prakrit. Perhaps it hasn't been done yet because people have different ideas about classifying the language of the Ashokan inscriptions (except for the Greek, Aramaic, Mansehra and Shahbazgarhi ones). According to one viewpoint, the Girnar dialect is an old western Prakrit that shares areal features with Maharastri but is not the ancestor of Maharastri, and most of the other dialects are an old eastern Prakrit or Old Magadhi.
How should synonyms with different etymologies be handled? At 𑀧𑀼𑀮𑀺𑀲 (pulisa), {{syn|inc-ash|𑀫𑀼𑀦𑀺𑀲}} is used, while at 𑀳𑀺𑀭𑀁𑀦 (hiraṃna), the descendants of हिरण्य (híraṇya) and सुवर्ण (suvárṇa) are all in the same table. At 𑀅𑀕𑀺 (agi /⁠aggi⁠/), 𐨗𐨆𐨟𐨁 (joti) is in the table because it is used at Shahbazgarhi where 𑀅𑀕𑀺 (agi /⁠aggi⁠/) and 𐨀𐨒𐨁 (agi /⁠aggi⁠/) are used in the other dialects.
Should words be lemmatised or entered as attested? For example, Girnar 𑀅𑀙𑀢𑀺 (achati /⁠acchati⁠/) is actually attested as 𑀅𑀙𑀢𑀺𑀁 (achatiṃ) with a crack between 𑀙 (cha) and 𑀢 (ta) (Hultzsch page 24), so it was lemmatised to 𑀅𑀙𑀢𑀺 (achati /⁠acchati⁠/).
If there are descendants, the choice of which dialect to put them under depends on the word. For example, 𑀮𑀼𑀔 (lukha /⁠lukkha⁠/) and 𑀯𑁆𑀭𑀙 (vracha /⁠vraccha⁠/) have different descendants even though they are in the same table.
Should the {{alt form}} be removed from 𑀓𑀢 (kata), and should the descendants at 𑀓𑀝 (kaṭa) be moved to 𑀓𑀢 (kata)?
Templates like {{inc-ash-noun}} have not been made yet.
Should all dialects be included in Sanskrit descendant trees, or should only one be chosen?
The map is not entirely accurate. For example, at 𑀢𑀺𑀁𑀦𑀺 (tiṃni), the dots for Lauriya-Nandangarh, Lauriya-Araraj and Rampurva are shown in Nepal even though they are all in Bihar.
Support for a {{m|ts=}}-style parameter for the table in the data modules. Kutchkutch (talk) 11:39, 5 September 2020 (UTC)
Ah, I did not see that problem with the table because about 90% of my Wiktionary work is done in the mobile mode. Is there a way to deal with this? Would replacing the current table with a non-collapsible one be desirable? On the other hand, it may make a page unnecessarily long.
As I understand it, creating a Wikipedia article for Ashokan Prakrit is a challenging task. I regretfully predict that Wikipedia editors might perhaps not respond well to our treatment of Ashokan Prakrit - the way we define it as being Proto-Middle-Indo-Aryan, even though this is the most practical and the best approach.
And they will not listen to us. Changes may be reverted. I've noticed, compared to Wiktionary editors, Wikipedia editors are less malleable to radical changes even if these changes are for good. Many of them have a really patronising, haughty know-it-all attitude without even knowing anything about the subject matter. Recently, it was revealed that over half of the content in the Scottish Wikipedia was written by someone who doesn't know the language and yet was treated as an authority in the subject and would use his admin powers to override helpful changes made by native Scots speakers. -- Bhagadatta (talk) 12:37, 5 September 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Bhagadatta, Kutchkutch: Okay so I've made a lot of changes, and here are some of my thoughts:

  • Added a textual note above each table listing the dialects the pagename form is attested in. E.g. at 𑀅𑀁𑀩 (aṃba) it says "Attested at Allahabad-Kosambi, Delhi-Topra, and Lauriya-Araraj.". This should cover what the label data currently provides, so I can remove the labels by bot if you both agree that this alternative maintains the necessary information.
  • Added support for spelling variants in the dial data module pages (listed at the top of MOD:inc-ash/dial/data). Renamed "Kosambi" to "Allahabad-Kosambi". More dialects can be added there as needed.
  • I think synonyms should be maintained within each table, that way we get a more interesting view of dialectal differences. For example, the variation for "to see" at 𑀧𑀲𑀢𑀺 (pasati). This we should probably mention at WT:About Ashokan Prakrit. Also, I'm not sure if 𑀧𑀼𑀮𑀺𑀲 (pulisa) and 𑀫𑀼𑀦𑀺𑀲 (munisa) are truly synonyms, the former (at least in Sanskrit) appears to be more specifically male.
  • I think words should be lemmatized for consistency and the use of the lemma for comparative purposes, but in tables we can always note that the lemma form is not actually attested in a particular dialect.
  • I'm okay with keeping descendants of forms on the same table on the forms that the descendants best match with.
  • Dialects I think should all be listed in descendant trees if they have descendants listed of their own or otherwise are not merely scribal variants.
  • Are {{inc-ash-noun}} etc. necessary as of now? I could do a bot conversion for those as well if we feel the need to have those.
  • Fixed the map; there was an error in the math for converting coordinates to pixels.

AryamanA (मुझसे बात करेंयोगदान) 23:02, 5 September 2020 (UTC)
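As a rough illustration of the coordinate-to-pixel conversion mentioned in the last bullet, here is a Python sketch. It assumes a simple equirectangular map image with a known bounding box; the function name, the bounds and the image size are made up for the example, and nothing here is taken from the actual module.

```python
def latlon_to_pixel(lat, lon, bounds, width, height):
    """Convert latitude/longitude to pixel coordinates on an equirectangular map.

    bounds is (south, west, north, east) in degrees; all values are hypothetical.
    """
    south, west, north, east = bounds
    # Longitude grows to the right, so x scales directly with it.
    x = (lon - west) / (east - west) * width
    # Pixel rows grow downward while latitude grows upward, so y is measured from the north edge.
    y = (north - lat) / (north - south) * height
    return x, y


# Illustrative bounding box and image size (assumptions, not the real map's values).
bounds = (5.0, 65.0, 40.0, 100.0)
print(latlon_to_pixel(22.5, 82.5, bounds, 700, 700))
```

Measuring y from the north edge rather than the south edge is the step that is easiest to get wrong in this kind of conversion, since a point would otherwise be drawn too far north or south.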

@AryamanA: Cool! I Support removing the labels, now that there's the automatically added note. -- Bhagadatta (talk) 02:14, 6 September 2020 (UTC)
@AryamanA, Bhagadatta: Thanks for going through that long list and addressing most of those issues! The unaddressed issues can be dealt with if and when the need arises.
Now that there is an automatically generated note above the table, the labels on entries with tables can be removed. However, the labels are still necessary for words that are only attested in one dialect (without a table) unless they are later found in other dialects. For example, assuming 𑀧𑀸𑀝𑀮𑀺𑀧𑀼𑀢 (pāṭaliputa) is only attested in the Girnar dialect, the label on the headline 𑀧𑀸𑀝𑀮𑀺𑀧𑀼𑀢 (pāṭaliputa) and the automatic categorisation that it provides are still necessary.
The presence or absence of the serial/Oxford comma is a stylistic choice, but there is an error at 𑀳𑀺𑀭𑀁𑀦 (hiraṃna) with:
Attested at Girnar, and Sopara.
["ts"] = "acchati" has been added to MOD:inc-ash/dial/data/𑀅𑀙𑀢𑀺, but it doesn't appear in the table at 𑀅𑀙𑀢𑀺 (achati). Maybe this feature is not operational yet.
The lack of {{inc-ash-noun}} etc. is just an observation, but not a necessary feature. Usually such templates get created when the number of lemmas for a particular language grows to more than a handful. Sometimes such templates provide language-specific features that are not provided by {{head}}.
While 𑀧𑀼𑀮𑀺𑀲 (pulisa) and 𑀫𑀼𑀦𑀺𑀲 (munisa) may not be synonyms in the strictest sense of synonym in all contexts, they might be considered synonyms in the looser sense (near-synonyms, parasynonyms). These two senses of the word synonym are elaborated upon at the English entry for synonym. Is/Should {{syn}} be used for this looser sense of synonym especially when there is a limit to the possible number of attested Ashokan Prakrit words? Or, would the See also section be better?
At त्रि (trí), the descendants were suppressed with {{see desc}}. While I agree with the principle of minimising very long descendants lists, my impression is most speakers of NIA languages are not very familiar with the finer distinctions of MIA (Ashokan, Ardhamagadhi, Sauraseni, Maharastri, Magadhi, etc). So an NIA speaker who has some familiarity with Sanskrit might be a bit confused to see Ashokan Prakrit instead of Hindi तीन (tīn) further down. Or, is this not relevant when determining whether to use {{see desc}}? Kutchkutch (talk) 09:02, 6 September 2020 (UTC)
@Kutchkutch: I almost never have a problem with a full list of descendants being shown but in this particular case the list was way too long. It's true that it may be a bit confusing to people not familiar with the details of Indo-Aryan chronology. Anyway a full account of etymology can be found by them at तीन (tīn) which even makes it clear that the word is not from त्रि itself, but from the neuter nominative plural त्रीणि (trīṇi). -- Bhagadatta (talk) 10:51, 6 September 2020 (UTC)
@Bhagadatta: Although suppressing lengthy descendants trees is one of the primary reasons to use {{see desc}}, the documentation does not say anything about length. Perhaps something about length should be added to its documentation. Kutchkutch (talk) 11:24, 6 September 2020 (UTC)
@Kutchkutch: I agree, that needs to be done. Some sort of a rule using which one can decide whether to use {{desctree}} or {{see desc}} should be there on that page like "if your blue-link descendant has one, two or three ultimate descendants max. then it's loss of info to require readers to click on a descendant to see further descendants, so use the former. On the other hand if there is a huge list of descendants, each with their own descendants then keep the page neat and compact by using the latter". -- Bhagadatta (talk) 11:44, 6 September 2020 (UTC)
@Bhagadatta, Kutchkutch: Okay, I think the best policy is to maintain separate tables for exclusive usages of terms. Like in Girnar we find only pasati and not the various dakhati-type forms; there is a distinction in terms of which word got generalized to mean "see". This is not the case with munisa and pulisa, which were preserved side-by-side across dialects.
I sort of did a workaround for achati, feel free to use the format that I did until I figure out a better way of presenting the transcription. Also fixed the "and" issue, who knew producing sentences from code is such a pain!
Yeah, I think really long descendant trees have their pros and cons. I would much prefer some kind of automatically generated tree that can be expanded by the user (and is collapsed by default) but that seems to be beyond Wiktionary's memory loads haha.
Also, I started w:Ashokan Prakrit, feel free to edit it. —AryamanA (मुझसे बात करेंयोगदान) 04:10, 7 September 2020 (UTC)
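As an aside on the "and" issue, producing the attestation note from a list of dialects can be sketched like this in Python. The function name is hypothetical and this is not the module's implementation; it only shows one way to avoid output such as "Attested at Girnar, and Sopara." when there are exactly two dialects, while keeping the serial comma for three or more.

```python
def attestation_note(dialects):
    """Build an 'Attested at ...' sentence from a list of dialect names."""
    names = list(dialects)
    if not names:
        return ""
    if len(names) == 1:
        listing = names[0]
    elif len(names) == 2:
        # Two items: no comma before "and".
        listing = f"{names[0]} and {names[1]}"
    else:
        # Three or more items: comma-separated, with a serial comma before "and".
        listing = ", ".join(names[:-1]) + f", and {names[-1]}"
    return f"Attested at {listing}."


print(attestation_note(["Girnar", "Sopara"]))
print(attestation_note(["Allahabad-Kosambi", "Delhi-Topra", "Lauriya-Araraj"]))
```

Whether to keep the serial comma in the three-or-more case is a style choice; dropping it only changes the last branch.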
@AryamanA: The article looks great, thanks for creating it! -- Bhagadatta (talk) 07:23, 7 September 2020 (UTC)
@AryamanA: Thanks for everything. I haven't edited Wikipedia in a long time, so I've forgotten about how editing there differs from en.wikt.
@Bhagadatta:
"Wikipedia editors might perhaps not respond well to our treatment of Ashokan Prakrit...they will not listen to us. Changes may be reverted..."
Well, no one's complaining yet. Maybe that's because the community that edits Indo-Aryan languages hasn't found that article yet. How w:Ashokan Prakrit relates to Early Indian epigraphy will be interesting to see. Although the article might not go under w:Category:Memorials to Ashok, documenting w:Ashokan Prakrit is a memorial to Ashoka. Although there's a lot of fiction in Aśoka (film) and Chakravartin Ashoka Samrat, they help understand the circumstances in which w:Ashokan Prakrit was used and were the motivation for creating the proper nouns (𑀅𑀲𑁄𑀓 (asoka), 𑀓𑀸𑀮𑀼𑀯𑀸𑀓𑀺 (kāluvāki), 𑀓𑀮𑀺𑀁𑀕 (kaliṃga), 𑀧𑀸𑀝𑀮𑀺𑀧𑀼𑀢 (pāṭaliputa), etc.). Kutchkutch (talk) 08:30, 7 September 2020 (UTC)