Wiktionary:Grease pit/2019/March

Table templates

Does anybody know why Hittite inflection table templates {{hit-decl}}, {{hit-decl-adj}}, and {{hit-conj}} aren't collapsing anymore? They used to be rendered collapsed by default. – Tom 144 (𒄩𒇻𒅗𒀸) 14:16, 3 March 2019 (UTC)

@Tom 144: Fixed by removing a semicolon. (The semicolon was causing problems for jQuery in MediaWiki:Gadget-VisibilityToggles.js.) — Eru·tuon 20:26, 3 March 2019 (UTC)

@Erutuon: Thank you. –Tom 144 (𒄩𒇻𒅗𒀸) 20:28, 3 March 2019 (UTC)

Template for categorizing pseudo-Anglicisms, etc.

Related to Wiktionary:Beer parlour/2019/March#Pseudo-X-isms_by_language, should we create one or more templates for categorizing entries as CAT:Pseudo-anglicisms by language, CAT:Pseudo-Italianisms by language, etc? We could either have different templates {{pseudo-Anglicism}}, {{pseudo-Italianism}}, etc, or one template like ~~{{pseudo-borrowing}}~~ {{pseudo-loan}} that would take language codes like ~~{{pseudo-borrowing|fr|en}}~~ {{pseudo-loan|fr|en}} (for "French pseudo-Anglicisms"). The latter would have to have its own (template-internal, non-Lua?) list of parameter-2-language-codes–to–text to account for "en" being "Anglicism" instead of "English", etc, at least for the relatively few divergent cases like that. (We could perhaps set it up to put unrecognized languages into a holding category we could check periodically to review the correctness of, or just to assume that anything that didn't have a "special" name specified should just use the language's canonical name + ism.) - -sche (discuss) 17:58, 3 March 2019 (UTC)[reply]
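The code-to-name lookup described in this proposal could be sketched roughly as follows. This is a Python illustration only (a real implementation would be wikitext or Lua), and the function names, the category wording, and the holding-category name are all hypothetical. Only the "en" → "Anglicism" special case comes from the discussion; for demonstration, the sketch applies both suggested fallbacks at once (canonical name + "ism", plus a holding category for review).

```python
# Small stand-in for the language data; a real template would read this elsewhere.
CANONICAL_NAMES = {"en": "English", "fr": "French", "it": "Italian", "de": "German"}

# Divergent cases where the "-ism" name is not canonical name + "ism".
# Only "en" is mentioned in the discussion; further entries would be added by hand.
SPECIAL_ISM_NAMES = {"en": "Anglicism"}

# Hypothetical holding category for source languages without a reviewed name.
HOLDING_CATEGORY = "Pseudo-loans with unreviewed names"


def ism_name(source_code):
    """Return (name, reviewed) for the source language of the pseudo-loan."""
    if source_code in SPECIAL_ISM_NAMES:
        return SPECIAL_ISM_NAMES[source_code], True
    # Fallback: assume canonical name + "ism" and flag it for later review.
    return CANONICAL_NAMES[source_code] + "ism", False


def pseudo_loan_categories(target_code, source_code):
    """Categories for e.g. {{pseudo-loan|fr|en}} (hypothetical category wording)."""
    name, reviewed = ism_name(source_code)
    categories = [f"{CANONICAL_NAMES[target_code]} pseudo-{name}s"]
    if not reviewed:
        categories.append(HOLDING_CATEGORY)
    return categories


print(pseudo_loan_categories("fr", "en"))
print(pseudo_loan_categories("de", "it"))
```

Keeping the special cases in one small table means a new divergent name needs a single added entry rather than a new template.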

If we are to do this, definitely create a single template {{pseudo-borrowing}}, {{pseudo-loan}}, etc. Benwing2 (talk) 21:00, 3 March 2019 (UTC)

I'm not sure these can always be called pseudo-borrowings. In some cases, most speakers would tell you that they're still in the purported original language. In others, they're considered to be a language that doesn't actually exist. Then there are various stereotypical "foreign accents" that don't sound at all like anything real people would say. It's like speakers of the recipient language have this imaginary language in their heads that they may or may not identify with some real language (that they know nothing about). It has its own vocabulary, grammar and phonology, however limited, and is different from the speakers' own language, but also from any other language. In some ways it's like a pidgin, and in others like a conlang. Another parallel is spirit-possession languages, ritual languages, rhyming slang, thieves' cants and other specialized registers. Chuck Entz (talk) 22:53, 3 March 2019 (UTC)

Hmm... but (re your first sentence) anything that can't be called a pseudo-borrowing isn't covered by this; I'm only proposing to take entries which already describe themselves as e.g. pseudo-anglicisms in their etymology section, and templatize that description in the hopes of making such entries easier to track and regularly categorize. (The word "borrowing" or "loan" wouldn't show up anywhere in the entry, and I'm not proposing to categorize them as e.g. "derived from English", just to templatize the existing categorization as "pseudo-anglicisms" etc.) From my perspective, the fact that some cases might be unclear does not seem any more of an impediment to this template than to any other etymology template. - -sche (discuss) 23:38, 3 March 2019 (UTC)[reply]

Headers of Thai templates

I just noticed that headers have recently been removed from Thai language-related templates. Formerly, the templates allowed users to add titles to the headers of drop-down lists, which worked somewhat like the templates "trans-top" and "trans-bottom". For example:

snake

อสรพิษ (à-sɔ̌ɔ-rá-pít)
โฆรวิษ
วิษธร
เงี้ยว (ngíao)
เงือก (ngʉ̂ʉak)
สัปปะ (sàp-bpà)
สรีสฤบ (sà-rii-srìp)
ภุชงค์ (pú-chong)
ทีฆชาติ
ผณิ (pà-ní)
ผณิน (pà-nin)
อุรคะ (ù-rá-ká)

But, now, in the Thai templates, the title parameter does not work any more and the headers are not displayed any longer. For example, the following code:

gives the following plain list without title:

Template:th-syn

This makes a mess of entries where several lists are placed in the same section, because there is no title to tell which list belongs to which definition. An example is the synonym section of the entry การเวก (gaa-rá-wêek). So I think we should do something to fix this problem. --Miwako Sato (talk) 07:33, 4 March 2019 (UTC)

@Miwako Sato: I fixed Module:columns so that the title is displayed in the same way that it is displayed for {{der3}} and similar templates. If Thai editors in general prefer the way the list looked before, that can be achieved by replacing Module:columns with Module:columns/old in Module:th. — Eru·tuon 21:18, 4 March 2019 (UTC)[reply]

Thank you very much! --Miwako Sato (talk) 04:40, 5 March 2019 (UTC)

@Erutuon: I just noticed something. I think the header should be displayed only when the parameter "title" is used. But, now, when the parameter "title" is not used, the header displays the title of the section it belongs to instead, which is redundant. See the sections "Alternative forms" and "Derived terms" in the entry เจริญ (jà-rəən) for example. --Miwako Sato (talk) 06:24, 5 March 2019 (UTC)[reply]

@Miwako Sato: Yes. That was true before the list layout changed. The default header is set in Module:th. — Eru·tuon 06:34, 5 March 2019 (UTC)[reply]

Tocharian B Genders

I'm not sure if this is the exact best place to put this, but it's a technical question about Wiktionary machinery, and I see nowhere better to post it.

Tocharian B is a gendered language, but I don't see any simple way of showing the genders of nouns for the entries I create, unlike in Latin or Ancient Greek, which can have their genders easily shown through formatting text. Looking at the other Tocharian B noun entries that I didn't create, it doesn't look like any of them have their genders shown, which leads me to believe that there isn't any code set up to display Tocharian B genders.

Would it be possible to fix this in the Lua code, and, if not, how can I work around it? GabeMoore (talk) 19:36, 4 March 2019 (UTC)[reply]

@GabeMoore: The noun template {{txb-noun}} accepts gender as the second parameter; it seems the issue is simply that nobody has added the genders yet. If you are not familiar with the use of these templates, they go immediately under the part-of-speech header (e.g. ===Noun===) and they look something like {{txb-noun|gandha|m}}. Hope that helps, and this is a perfectly reasonable place for this request. - TheDaveRoss 19:52, 4 March 2019 (UTC)

Thanks so much. A lot of the nouns have gender that either changes or is unknown in the plural. How would I denote this? GabeMoore (talk) 14:06, 6 March 2019 (UTC)[reply]

Obligate non-scriptio continua in a Mandarin Chinese example

Most of the time, we assume that Mandarin Chinese is an unspaced scriptio continua. But there is a political slogan in Mainland China that includes an obligate space between the two halves of the six character phrase, and it appears in the original text of an example that I am adding (and elsewhere- not idiosyncratic or a typo). Is there a way to add a space between the actual characters (not just a space in the pinyin) in a zh-x? Here's the page with the example: 四十埠 (Sìshíbù). --Geographyinitiative (talk) 20:46, 4 March 2019 (UTC)[reply]

A solution was found which can be seen on the page. --Geographyinitiative (talk) 01:45, 5 March 2019 (UTC)

Now I’m curious. Is it known what the rationale is for this unusual mandatory space? Is it to avoid an ambiguity in which 在行动 applies to just 雷锋 instead of the whole slogan 学雷锋? --Lambiam 22:02, 5 March 2019 (UTC)[reply]

Minor change to "Module:category tree/PIE root cat"

Could someone please modify "Module:category tree/PIE root cat" so that when it displays, for example, the description line "English terms that originate ultimately from the Proto-Indo-European root *ḱe-" on category pages, the link is to "Reconstruction:Proto-Indo-European/ḱe" and not "Reconstruction:Proto-Indo-European/ḱe-" (a redirect)? Also, perhaps there should be a full stop at the end of the description line. Thanks. — SGconlaw (talk) 09:17, 6 March 2019 (UTC)[reply]

I added the period to the category descriptions. The cause is actually {{PIE root}}, not the category tree module. {{PIE root}} always adds a hyphen to the PIE root in the category name; the hyphenless equivalent is {{PIE word}}. (I see I need to luaify {{PIE word cat}}.) — Eru·tuon 09:47, 6 March 2019 (UTC)[reply]

Oh, I didn't realize that the PIE element in question was a word rather than a root. Thanks. — SGconlaw (talk) 12:18, 6 March 2019 (UTC)

@Sgconlaw: I don't know if the distinction is very rigorous. Practically, you use {{PIE root}} if the entry name ends in a hyphen, otherwise {{PIE word}}. — Eru·tuon 20:46, 6 March 2019 (UTC)[reply]

{{PIE root}} is of course meant to be used for roots... —Rua (mew) 15:23, 8 March 2019 (UTC)

@Rua: Okay, so I just mean I'm not sure if *ḱe is considered a root or not given that it's written with a hyphen in its headword line but without in its entry name. — Eru·tuon 19:17, 8 March 2019 (UTC)[reply]

I wouldn't call it a root, as it doesn't take part in word formation the same way normal PIE roots do. PIE roots always begin and end with a consonant. —Rua (mew) 19:42, 8 March 2019 (UTC)[reply]

`{{ja-pron}}` is not suited for the Japanese dialects

@Poketalker, Mellohi!, Suzukaze-c, KevinUp, Eirikr, Dine2016 I tried to add dialectal accents to Japanese entries, but I found many problems with the template. Although it produces detailed IPA transcriptions, its phonological rules apply only to the Tokyo dialect. For example:

Tokyo type (東京式) accents distinguish words only by the position of the downstep, but Keihan type (京阪式) accents also distinguish the pitch at the beginning of the word, and Two-patterns type (二形式) accents are fixed into two patterns determined by position from the end of the phrase.
- In relation to the differences in accent, the pattern of vowel devoicing can vary. Some Western Japanese dialects, including Kansai, don't have devoiced vowels.
Medial ザ行 /z/ is a free allophone [dz/dʑ ～ z/ʑ] in Tokyo, but [z/ʑ] in many areas of Western Japan including Kansai, and [dz] in most areas of the Tōhoku region.
- There are also dialects that distinguish between ジ [ʑi]・ズ [zu] and ヂ [dʑi]・ヅ [dzu], in Kōchi, Miyazaki and Kagoshima prefectures.
ウ /u/ is unrounded [ɯ] in Tokyo, but rounded [u] in Mie prefecture and westward, and a central vowel [ʉ] in Tochigi prefecture and northward.
- In Northern Tōhoku, part of ウ段 /u/ has merged with イ段 /i/ as /ɨ/.
The Niigata dialect preserves the difference between /oː/ and /ɔː/ from the medieval era.
Some dialects have fused vowels /ɛ/, /ø/ and /y/.
Some Eastern Japanese dialects merge イ段 /i/ and エ段 /e/ into /e̝/.
The Kagoshima dialect has final stop consonants ッ [ʔ̚ ～ t̚], ㇲ [s] and ㇱ [ɕ].

It's totally impossible to express these features with the template in its present state. Help from anyone who can tweak the template is needed.--荒巻モロゾフ (talk) 14:23, 9 March 2019 (UTC)

Um... what do you think of the “unified Japanese” approach on my user page? --Dine2016 (talk) 15:23, 9 March 2019 (UTC)[reply]

That's a neat idea. To explain the history of accent patterns, it's good enough to put Tokyo, Osaka and Kagoshima dialects (they complement each other from the mergers of the accent pattern from the proto form) under the entry "Modern Japanese" basically. In order to handle various phonemes and accents, it is better if there are fields to write IPA freely.--荒巻モロゾフ (talk) 16:12, 9 March 2019 (UTC)[reply]

We absolutely need to re-evaluate our templates. They are suited only for the modern Tokyo dialect, to the detriment of other forms of Japanese. I started a personal remake of ja-pron with this in mind, but I didn't like my code. Perhaps I should revisit it. —Suzukaze-c ◇◇ 05:53, 10 March 2019 (UTC)[reply]

FWIW, I fully support extending our {{ja-pron}} template to handle additional varieties of Japanese. I don't think the current state was ever intended to be the end-point -- we simply had to start somewhere, and most of the available information is about the so-called "standard" Japanese of the Tokyo region, as exemplified by NHK's broadcasts and reference materials. Building out support for additional dialects would be brilliant. ‑‑ Eiríkr Útlendi │^{Tala við mig} 19:02, 3 April 2019 (UTC)[reply]

Profiling Lua memory usage and processing time

@Rua, Erutuon, JohnC5 Anyone have any hints as to how to:

Profile where exactly the memory usage and processing time of a given page are going?
Reduce the memory usage and processing time?

The only memory usage info I've been able to find so far is when you preview a page, it lists the total memory usage. But it doesn't break out the usage by function or module or anything, which would help a lot. I vaguely remember an old thing that listed the processing time per function but I can't find it any more.

As for reducing memory usage, I discovered at least for Module:quote that replacing precached references to require("foo") with calls to require each time I need to use a module reduces memory usage at the expense of processor time. This was enough to make black no longer throw memory-usage errors, although I'm really scraping at the margins; the majority of memory usage appears to come from the voluminous translation tables. Any other ideas? I imagine there must be ways of optimizing Module:links and/or the modules called by Module:links, and this would be useful, since they're used so often. Benwing2 (talk) 19:31, 9 March 2019 (UTC)[reply]

OK, scrap that, the reduction in memory usage was because some code was commented out. Reenabling that code bumps up the memory usage to where it was, leading me to think that avoiding precaching modules doesn't help. Benwing2 (talk) 19:35, 9 March 2019 (UTC)

We could change the translation table format entirely, so that the whole contents is passed to a template and processed all at once. A major cause of memory usage is actually memory leaks between module invocations, so that the more modules are called on a page the higher the usage climbs. If we make the entire translation table one module call, it may work. —Rua (mew) 19:41, 9 March 2019 (UTC)[reply]

(edit conflict) Each page has a report on how much time the transclusions of each template have taken. It is in the HTML source code (search "Transclusion expansion time report") and can be accessed using JavaScript, but as far as I can tell isn't displayed anywhere. There doesn't seem to be any information on how much memory each module uses. — This unsigned comment was added by Erutuon (talk • contribs) at 19:44, 9 March 2019 (UTC).[reply]
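As a rough illustration of how the per-template timing report could be pulled out of a page's HTML source, here is a small Python sketch. The layout it expects (an HTML comment with a "(%,ms,calls,template)" header followed by whitespace-separated rows) is an assumption about what MediaWiki emits, the function name is hypothetical, and the sample numbers are invented.

```python
import re

# Assumed layout: a header line, then rows of "percent% ms calls name", closed by "-->".
REPORT_RE = re.compile(
    r"Transclusion expansion time report \(%,ms,calls,template\)\n(.*?)\n-->",
    re.DOTALL,
)
ROW_RE = re.compile(r"^\s*([\d.]+)%\s+([\d.]+)\s+(\d+)\s+(.+?)\s*$")


def parse_expansion_report(html):
    """Return a list of (template, milliseconds, calls, percent) tuples."""
    match = REPORT_RE.search(html)
    if match is None:
        return []
    rows = []
    for line in match.group(1).splitlines():
        row = ROW_RE.match(line)
        if row:
            percent, ms, calls, name = row.groups()
            rows.append((name, float(ms), int(calls), float(percent)))
    return rows


# Invented sample data in the assumed format.
SAMPLE_HTML = """<!--
Transclusion expansion time report (%,ms,calls,template)
100.00%  512.300      1 -total
 61.20%  313.528    420 Template:t
 20.05%  102.716    180 Template:t+
-->"""

for name, ms, calls, percent in parse_expansion_report(SAMPLE_HTML):
    print(f"{name}: {ms} ms over {calls} calls ({percent}%)")
```

This only covers time per template; as noted above, no comparable per-module memory breakdown appears to be available.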

@Rua Can you explain more? What sort of memory leaks are you referring to, how do they happen and how can they be eliminated? I did notice, for example, that the block that handles the chapter= parameter in Module:quote (lines 245-281) is where most of the memory is going. This code calls 3 modules but each one looks small, so I don't quite understand what's going on. Benwing2 (talk) 19:47, 9 March 2019 (UTC)[reply]

From past experiments, it seems that if you call {{t}} a ton on the same page, it uses more memory than just a single {{t}}. That doesn't make sense if you assume that each invocation is completed and all memory is freed before the next one is processed. Instead, it appears that the software processes all the invocations in parallel, but from a shared memory pool. Thus, the more invocations are processed in parallel, the higher memory usage becomes. At the same time, of course, the parallel processing speeds it up.

If the entire translation table is reduced to a single invocation, that limits the ability for each separate call to {{t}} to be done in parallel, and forces the whole thing to be processed serially as one thread of execution. Thus, instead of the invocations for each {{t}} adding up to the memory usage, you only have the memory used by that single transclusion.

There are other optimisations you can do as well with this method. With {{t}}, each invocation has to search for and import the appropriate language module, and this is then kept in memory for any future calls to {{t}} so that the whole thing doesn't slow down to a crawl. However, with translation tables we already know it's highly likely that we'll need all language modules. If we transclude all language modules in one go (via Module:languages/alldata) in advance, but don't keep them in memory after, that may be more efficient. —Rua (mew) 19:57, 9 March 2019 (UTC)[reply]

You're doing a lot of string concatenation in this block. This is bad and creates a lot of garbage that may not go away. DTLHS (talk) 20:01, 9 March 2019 (UTC)[reply]

@Rua, DTLHS Interesting ... if it's processing in parallel it must first divide the page up into chunks, because you only see the out-of-memory errors appearing after a certain point. However, I notice that all errors start after a certain point on the page, which can change slightly depending on code changes, which suggests that if it's parallelizing it must do it in small chunks. There are some very strange things about memory usage, though; as for the block in question in Module:quote, all the memory usage is coming from the call to Module:roman numerals, which adds 2 MB even though it's only invoked once (string concatenation doesn't appear to be the culprit). Guarding it with a regex call to see if the thing actually looks like a Roman numeral gets back that 2 MB. I don't see what that one call to that module could possibly be doing to gain 2 MB. Furthermore, adding cached modules at the top sometimes *reduces* memory usage, and triggering an error in the middle of the page in one spot also reduces memory usage. There's definitely something weird going on here. Benwing2 (talk) 20:32, 9 March 2019 (UTC)[reply]

@Benwing: DTLHS is certainly correct that Module:quote is using a lot of memory on string concatenation. This could perhaps be avoided by refactoring the code to store high-frequency strings in a submodule which is loaded with mw.loadData. I also find it a bit odd that you haven't used Module:parameters (though this in itself might add extra overhead). Also, as @Victar may have informed you, we are also working on a Module:cite-meta project, which I think could be efficiently combined with Module:quote. That way, a lot of the parameter processing, string storage, and output styling could be unified in one place. The other thing would be to get Module:Quotations to run through Module:quote as well, but that may be a project for the future. —*i̯óh₁n̥C ^[5] 21:13, 9 March 2019 (UTC)

I did an experiment on User:Rua/sandbox. Have a look at the source code for the page. The entire translation table contents is passed to the module Module:User:Rua/translations new, which then does its own simpler template parsing. It also directly loads all the language data at once, but still uses the regular Module:links for processing. The funny braces are just to prevent the software from treating it as templates before being passed to the module.

Results with the new attempt:

Lua time usage: 0.369/10.000 seconds
Lua memory usage: 12.88 MB/50 MB

Results with the original translation table from black:

Lua time usage: 0.780/10.000 seconds
Lua memory usage: 28.22 MB/50 MB

It's twice as fast and uses only half the memory. —Rua (mew) 20:39, 9 March 2019 (UTC)[reply]

It's too bad we can't do any real profiling and have to rely on trial&error. I opened phab:T188492 last year, but not much has happened. – Jberkel 20:44, 9 March 2019 (UTC)[reply]

Another wise proposal that has been made by @Erutuon is to alter the entry-title generation code to be slightly more efficient. Right now it goes through each accented character and tries to replace it with the unaccented equivalent. Erutuon suggested that it might be faster in many cases to use mw.ustring.toNFD to decompose the diacritics and then remove them. Further efficiency might be gained by doing nothing at all if the NFC and NFD strings are the same length. —*i̯óh₁n̥C ^[5] 21:20, 9 March 2019 (UTC)
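The decompose-then-strip idea can be sketched in Python with the standard `unicodedata` module. This is only an analogue of the proposal, not the code in Module:languages. The function name is hypothetical, and the sketch removes every combining mark, whereas a real language would presumably remove only specific diacritics.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks by decomposing to NFD and filtering them out."""
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)
    # Shortcut from the proposal: if decomposition does not lengthen the string,
    # no precomposed accented characters were present, so do nothing.
    # Caveat: this also skips strings whose only marks are combining characters
    # with no precomposed form, which NFC cannot shorten.
    if len(nfc) == len(nfd):
        return text
    stripped = "".join(ch for ch in nfd if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("λόγος"))
print(strip_diacritics("mālum"))
print(strip_diacritics("word"))
```

The length comparison costs two normalization passes, so whether it is a net gain depends on how often the input is already free of diacritics.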

@Rua Your suggestion above of using the new code in Module:User:Rua/translations new is a great idea. Are there reasons not to adopt it, and if so what? We could use this for now only on pages that otherwise trigger memory errors, rather than trying to convert pages en masse. Benwing2 (talk) 22:30, 9 March 2019 (UTC)[reply]

We probably want a better solution than the weird Unicode braces I used, though. It was just a quick test/proof of concept. Someone else would need to work it out more to get it fully functional. —Rua (mew) 22:49, 9 March 2019 (UTC)[reply]

I've implemented the idea that John described, of a remove_diacritics field in the entry_name and sort_key fields of language data tables. Latin and Ancient Greek now use it. When I tested the change in the sandbox, I saw a significant decrease in memory usage in a page with a lot of Ancient Greek links, but that may have been due to the way I was testing it. At the very least it works. — Eru·tuon 01:58, 14 March 2019 (UTC)[reply]

@Erutuon: This is a great idea. It won't make a big difference, but a lot of African languages could use that as well. —Μετάknowledge^{discuss/deeds} 02:43, 14 March 2019 (UTC)[reply]

How many of our users like to see, use, or need to use the full set of translations simultaneously? Could users choose which subset of languages are to be served to them? Could users choose not to be served any translations? DCDuring (talk) 23:04, 9 March 2019 (UTC)[reply]

The problem with dealing with issues via settings is that a lot of users (even those with accounts) often won't be logged in. Equinox ◑ 23:09, 9 March 2019 (UTC)[reply]

Could we not read or infer their preferred language somehow? And couldn't translations be made available based on session-specific preferences? Encouraging registration seems like a good thing. Having preset clusters of translations (eg, language families or languages in a given script) could speed the process of selecting such preferences. DCDuring (talk) 23:17, 9 March 2019 (UTC)[reply]

@Rua I'm thinking it wouldn't be too hard to make your prototype fully functional:

In place of weird Unicode chars, we create templates {{tt}}, {{tt+}}, {{tt-check}}, {{tt+-check}} that are parallel to {{t}}, {{t+}}, etc. but do nothing except echo their arguments in some format, e.g. {{tt|ary|كحال|tr=kḥāl}} might generate ⦃⦃t¦ary¦كحال¦tr=kḥāl⦄⦄. These templates should be non-Lua so they don't add to the Lua overhead. This way, other template calls can still be embedded into the argument passed to the translations-new function.
In place of directly calling full_link from translations-new, extract out the appropriate code from Module:translations and call that, so the output is exactly the same whether translations-new is used or not.

If you think this is a good idea, I'll implement it. Benwing2 (talk) 01:21, 10 March 2019 (UTC)[reply]
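The passthrough scheme outlined above can be sketched as a pair of functions: one that echoes a call's arguments in the brace-free format, and one that recovers the calls from the combined text handed to the module. This is a Python illustration with hypothetical function names. The delimiter characters are the ones from the example in the proposal, and the parser assumes a positional argument never contains "=".

```python
import re


def tt(name, *positional, **named):
    """Mimic a passthrough template: echo the arguments without template braces."""
    parts = [name, *positional, *(f"{key}={value}" for key, value in named.items())]
    return "⦃⦃" + "¦".join(parts) + "⦄⦄"


CALL_RE = re.compile("⦃⦃(.*?)⦄⦄")


def parse_calls(data):
    """Recover (name, positional_args, named_args) for each echoed call in data."""
    calls = []
    for body in CALL_RE.findall(data):
        name, *args = body.split("¦")
        positional, named = [], {}
        for arg in args:
            key, sep, value = arg.partition("=")
            if sep:
                named[key] = value
            else:
                positional.append(arg)
        calls.append((name, positional, named))
    return calls


table_text = "* Moroccan Arabic: " + tt("t", "ary", "كحال", tr="kḥāl")
print(parse_calls(table_text))
```

Because every call in the table reaches the module as one string, the module can load its language data once and process all of them in a single invocation.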

@Benwing2 I'm fine with it. Be aware that such a "passthrough" would ignore unrecognised parameters, which normally trigger errors through the use of Module:parameters. It would be nice if we could retain the parameter checking somehow, either inside the passthrough template or directly in the new translation module. Perhaps the passthrough could be implemented in Lua, but in an extremely simple module that just iterates through args and spits everything back out again verbatim, leaving the translation module to do the checking. —Rua (mew) 15:11, 10 March 2019 (UTC)[reply]

@Rua That is a good point. I don't think it's very easy to implement the parameter checking inside a template. I think for the moment it doesn't matter all that much as this would presumably only be necessary on heavily trafficked pages, but once I implement it I will try implementing the passthrough in Lua the way you suggest, to see how much extra overhead it adds. An alternative is to not use a passthrough at all, but to use e.g. angle brackets and colons in place of braces and pipe signs; this should also work but might be a bit ugly. Benwing2 (talk) 19:36, 10 March 2019 (UTC)[reply]

@DTLHS, JohnC5 Are you sure it's using a lot of memory doing string concatenation? It's true it's doing some string concatenation but not generally of very large strings. The main way of building up the output is through the add() function, which adds to an array that is concat()ed at the end. I could rewrite all string concatenations inside of add() as multiple calls to add() if you think it really makes that much difference. As for Module:parameters, I didn't use that because the first pass is a direct port of Template:quote-meta/source. In a future pass I may fix it up to use Module:parameters. As for Module:cite-meta, this is the first I've heard of it. Is its goal to replace the Template:cite-meta template? That template works rather differently from Template:quote-meta/source (or at least its output is quite different), so I'm not sure how much code could be shared, although I imagine some of it. Feel free to hack on Module:quote or even move your code in Module:cite-meta into it. (If you're going to keep a separate module, I'd call it just Module:cite rather than Module:cite-meta.) Benwing2 (talk) 01:29, 10 March 2019 (UTC)[reply]

@Benwing2: I've always understood the most efficient method of string manipulation in Lua to be to create large tables of strings and then call :concat() at the end to avoid having intermediate phases lying around. Another way I've seen Rua do it is to put the final layout on another page, load that in and replace all the {{{param_names}}} in one fell swoop.

As for Module:cite-meta (which I agree should be Module:cite, if separate), we've been keeping this one under wraps to a certain extent. The notion around this one is to tag all the parts of the citation so that using CSS magic, you can control what citation type and how much to show. That way, the user who wants "simple, clean" citations can choose how much information is shown, and everyone else can get as much info as they want in whatever format. I think the quote templates could benefit from the same treatment. —*i̯óh₁n̥C ^[5] 02:12, 10 March 2019 (UTC)[reply]
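The accumulate-then-join pattern discussed in this exchange can be sketched as follows. This is Python rather than Scribunto Lua, where the equivalent would be a table of strings joined once with `table.concat`. The class name `OutputBuilder` and the sample strings are made up for illustration and do not reproduce Module:quote.

```python
class OutputBuilder:
    """Collect output fragments and join them once at the end."""

    def __init__(self):
        self._parts = []

    def add(self, *pieces):
        # Passing several pieces avoids building an intermediate string
        # such as "'''" + year + "'''" before storing it.
        self._parts.extend(pieces)

    def render(self):
        # A single join allocates the final string once.
        return "".join(self._parts)


out = OutputBuilder()
out.add("'''", "1999", "'''")
out.add(", ", "Some Author")
print(out.render())
```

Whether splitting small concatenations into separate `add()` arguments saves a measurable amount of memory is exactly the open question in the thread; the sketch only shows the shape of the change.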

`{{multitrans}}`

@Rua, Erutuon, JohnC5, Jberkel, DTLHS, Victar, -sche I implemented Rua's suggestion of processing the whole translation table together. See User:Benwing2/black-opt. This uses a template {{multitrans}} that surrounds a whole set of translation tables and requires that {{t}} inside of the template call be renamed to {{tt}}, and {{t+}} inside of the template call be renamed to {{tt+}}. (Things will still work if you don't rename, but you won't get the efficiency benefits.) This brings the memory usage down from 48M (just below the 50M limit) to 34M, and processing time from 2.09 to 1.29 seconds. There are probably further optimizations I could do, e.g. the implementations of {{tt}} and {{tt+}} aren't very smart about numbered params; with some additional work there to not generate blank numbered params, the memory usage might be brought down further. I haven't yet created pass-through templates {{tt-check}} and {{tt+check}} or documented {{multitrans}} except in the module code Module:translations/multi, but that isn't too hard. Benwing2 (talk) 22:48, 10 March 2019 (UTC)[reply]

I'm curious to know how it compares when there are multiple Translations sections on the page. Then you have to wrap each one separately. —Rua (mew) 22:51, 10 March 2019 (UTC)[reply]

You could also see what changes if you eliminate {{redlink category}} in the passthrough template. It does some things that may be rather expensive. —Rua (mew) 22:53, 10 March 2019 (UTC)[reply]

@Rua I will try eliminating the {{redlink category}} call. But I'm not sure what you're referring to about multiple Translations sections; in User:Benwing2/black-opt I used only two calls to {{multitrans}}, one wrapping all the adjective translation tables and one wrapping all the noun translation tables. Benwing2 (talk) 22:55, 10 March 2019 (UTC)[reply]

Ah, I missed that, never mind then. —Rua (mew) 22:56, 10 March 2019 (UTC)

@Rua Eliminating {{redlink category}} leads to essentially no observable difference in Lua memory or time. Benwing2 (talk) 23:01, 10 March 2019 (UTC)

Same story for optimizing the numbered params in {{tt}}. Benwing2 (talk) 23:05, 10 March 2019 (UTC)

I'm sure that's because of the opt-out list in the template, which has already eliminated any module calls through {{redlink template}} for entries on the list. The one unanswered question so far is how this interacts with the translation adder- does it get confused by the wrapper template and, if not, will it have to be modified to make this work? Chuck Entz (talk) 23:34, 10 March 2019 (UTC)[reply]

@Chuck Entz Yes, that will need to be modified. Benwing2 (talk) 00:42, 11 March 2019 (UTC)

BTW could someone point me to the JavaScript code for the translation adder? I'm not super familiar with JavaScript or how it works inside of Wiktionary. Benwing2 (talk) 00:49, 11 March 2019 (UTC)[reply]

Maybe it's this? MediaWiki:Gadget-TranslationAdder.js If so, I'll need some help fixing this as I'm not very familiar with JavaScript. Benwing2 (talk) 00:54, 11 March 2019 (UTC)[reply]

Yeah, that's it. I've wanted to improve it because it is kind of antiquated and messy, but I haven't quite grasped how it works (or MediaWiki:Gadget-Editor.js, which it depends on) so I haven't done much yet. — Eru·tuon 01:07, 11 March 2019 (UTC)[reply]

@Erutuon Any way you could hack on this? It should add {{tt}} instead of {{t}} if the translation-terms section is surrounded by {{multitrans}} (i.e. preceded by "\n{{multitrans|data=}}\n" and not preceded by a later occurrence of "\n}}\n"). If that's hard, you can look in the translation-terms section and see if there are any existing instances of {{tt}} or {{tt+}}. Also, when converting {{t}} to {{t+}}, it should correspondingly convert {{tt}} to {{tt+}}. I think the rebalancing code works fine without change. Benwing2 (talk) 05:43, 11 March 2019 (UTC)[reply]

@Benwing2: I'll take a look at it. It might simplify things to require that {{multitrans}} will enclose all of a given Translations section and not just part of it, so that the gadget does not have to determine where the template begins and ends, and can assume that the whole section should use {{tt}} and {{tt+}} for newly added translations. — Eru·tuon 08:56, 11 March 2019 (UTC)[reply]

Another optimization

Another optimization that we could do regarding memory is to change the way we currently structure the language data. Right now, we have a neat table of all the properties of a language, but in practice most of those properties go completely unused for the majority of use cases, yet we still have to import all the data and keep it in memory. Perhaps if we split the language data into smaller modules containing a subset of the data, then this data could be loaded only if it's actually needed. For example, most templates don't care about a language's parent or family, so we could put that in a separate module so that it can be loaded only when someone actually needs it. We could look into other pieces of data we could split off too to reduce the size of the data import needed. Perhaps even just one module per data point. Further splitting codes by second letter or something may also work, although that would increase the number of modules by a lot. —Rua (mew) 23:26, 10 March 2019 (UTC)[reply]
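A minimal sketch of the split-by-property idea, in Python for illustration: each property lives in its own table, and a table is loaded only the first time something asks for it. The property names, the loader functions, and the sample values are hypothetical stand-ins for separate data modules.

```python
from functools import lru_cache

# Each entry stands in for a separate data module holding one property.
# The sample values are illustrative only.
_DATA_SOURCES = {
    "names": lambda: {"en": "English", "grc": "Ancient Greek"},
    "families": lambda: {"en": "gmw", "grc": "grk"},
}

load_log = []  # records which property tables were actually loaded


@lru_cache(maxsize=None)
def load_property_table(prop):
    """Load one property table on first use and cache it afterwards."""
    load_log.append(prop)
    return _DATA_SOURCES[prop]()


def get_property(code, prop):
    """Look up a single property without touching the other tables."""
    return load_property_table(prop).get(code)


print(get_property("en", "names"))
print(get_property("grc", "names"))
print(load_log)  # only "names" has been loaded so far
```

The trade-off raised above still applies: finer splits keep unused data out of memory but multiply the number of modules to maintain and load.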

Template:ja-r

I created {{ja-r/multi}} and {{ja-r/args}} based on Rua and Benwing's method to reduce memory in Han character entries. They can replace {{ja-r}} and reduce Lua memory and time usage somewhat. Example:

{{der-top3}}
* {{ja-r|人%人|ひと%びと|rom=-}}, {{ja-r|人々|ひとびと|[[people]]}}
* {{ja-r|人%垣|ひと%がき}}
...
{{der-bottom}}

↓

{{der-top3}}
{{ja-r/multi|data=
* {{ja-r/args|人%人|ひと%びと|rom=-}}, {{ja-r/args|人々|ひとびと|[[people]]}}
* {{ja-r/args|人%垣|ひと%がき}}
...
}}
{{der-bottom}}

In 水 and 人, replacing the {{ja-r}} templates in one or two Japanese derived terms sections brings the pages under the Lua memory limit, so they no longer have module errors. — Eru·tuon 05:35, 11 March 2019 (UTC)[reply]

@Erutuon Awesome!!! Benwing2 (talk) 05:43, 11 March 2019 (UTC)[reply]

Using simpler linking functions[edit]

A technique that I used on several Appendix pages, like Appendix:Ancient Greek endings and Appendix:English doublets, and in some templates like {{grc-correlatives}}, {{Chinese-numbers}}, and {{ca-decl-ppron}}, is to link the terms without using Module:links at all. Module:links is complex because it has to handle many different types of input. For instance, it has to ensure that {{l|mul|:}} links to Unsupported titles/Colon and that {{l|en|word#Etymology}} links to word#Etymology and not to word#Etymology#English, separately process each nested link in {{l|en|[[this]] [[word]]}}, and so on. This increases Lua memory and processing time.

In many situations, many of these features are not needed, so resources can be saved by avoiding Module:links and the templates that use it. For instance, if a list of terms does not contain any hash marks (#), the hash-recognizing feature is not needed. If there are no punctuation marks or whitespace characters that are not allowed in titles, you don't have to check for unsupported titles. If there are no nested links, the linking function doesn't have to check for nested links and process them specially. One or more of these conditions is often true in lists of terms in a particular language and in inflection tables.

So for instance Module:grc-link handles Appendix:Ancient Greek endings, and as far as linking is concerned (the module does other more complicated stuff), it only has to remove some diacritics to get the correct entry name and then create the correct link syntax. Even though it contains more than two thousand links, the page renders quickly and uses very little Lua memory.

{{ca-decl-ppron}} uses a more complicated technique because you can't insert a table into a template. It invokes Module:quick link, which basically transcludes the content of another template and then processes it. I don't like the technique because it's hard to reproduce, but at least the template uses fewer Lua resources now.

{{Chinese-numbers}} and {{grc-correlatives}} also generate tables, but the wikitext is instead found in Lua modules (Module:zh-numbers and Module:grc-correlatives). The modules modify the wikitext to add links and then return it. This technique is easy to understand, but it is somewhat weird to embed a large amount of wikitext in a Lua module. — Eru·tuon 08:34, 11 March 2019 (UTC)[reply]
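A minimal sketch of such a stripped-down linking function, in Python rather than Lua. It assumes the only diacritics to remove are the combining macron and breve, and that the input contains no hash marks, nested links or characters unsupported in titles; the function names are hypothetical and this is not the code of Module:grc-link.

```python
import unicodedata

# Assumption: only length marks are stripped to get the entry name.
STRIPPED_MARKS = {"\u0304", "\u0306"}  # combining macron, combining breve


def entry_name(term: str) -> str:
    """Remove macrons and breves, keeping accents and breathings."""
    decomposed = unicodedata.normalize("NFD", term)
    kept = "".join(ch for ch in decomposed if ch not in STRIPPED_MARKS)
    return unicodedata.normalize("NFC", kept)


def quick_link(term: str, language_name: str = "Ancient Greek") -> str:
    """Build plain link syntax; no checks for '#', nested links or
    unsupported titles are made, which is where the saving comes from."""
    return f"[[{entry_name(term)}#{language_name}|{term}]]"
```

Because none of the general-purpose checks run, the function is only safe on input that is known in advance to satisfy the conditions listed above.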

Lua error te[edit]

While fixing an error on Tee#German (I removed Dutch Low Saxon and added it to thee#Dutch, and moved West Frisian up), I was browsing some of the other entries and came across an error message at te#Swedish. I don't quite know how to fix this myself; it doesn't appear that my alteration caused the error (reverting my changes did not make the error message disappear). Servien (talk) 23:02, 9 March 2019 (UTC)[reply]

Very interesting. See the immediately preceding discussion for an instance of similar symptoms. DCDuring (talk) 01:29, 10 March 2019 (UTC)[reply]

It's a Lua memory error. The page has had it for ages; it's certainly not your fault. — Eru·tuon 07:00, 10 March 2019 (UTC)[reply]

For whatever it's worth, during the process of making a number of edits to the page replacing {{l}}s etc with bare wikilinks with manual language-section-tagging, I noticed that each one or two removed wikilinks generally reduced memory enough that another one or two lines (of {{head}}s, {{label}}s, or declension-table lines) worked, basically pushing the start of the error down the page (as if each of them was using roughly as much memory as the other). - -sche (discuss) 07:23, 10 March 2019 (UTC)[reply]

Cleanups of etymology templates[edit]

Just a heads-up that I am planning to make the following changes:

Changing {{back-formation}} not to output a final period, and fixing up uses of this template so that if nodot= is present it's removed, and otherwise a period is added after the template. (I think this is the only such template that still adds a period at the end.)
Changing all occurrences of |lang= in {{back-formation}}, {{clipping}}, {{deverbal}}, {{deverbative}}, {{ellipsis}}, and {{blend}} to use |1=, and moving all other numbered parameters over by one. All these templates currently support both |1= and |lang= for specifying the language. After this change I'll remove support for |lang= in these templates.
Renaming calls to {{deverbative}} to use {{deverbal}}.

In the longer run I'd like to do the following, but I won't do them yet as they may be controversial:

Rewrite all uses of {{prefix}}, {{suffix}}, {{confix}} and maybe {{compound}} in terms of {{affix}} and eventually eliminate those preceding three or four templates.
Rewrite all uses of {{circumfix}}, {{infix}}, etc. that use |lang= to use |1=, moving the numbered parameters over by one; and then eliminating the compatibility support for |lang=.

Benwing2 (talk) 05:59, 11 March 2019 (UTC)[reply]
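The conversion from |lang= to |1= described above amounts to a small parameter rewrite. The sketch below is illustrative Python, not the bot code actually used; it assumes template parameters have already been parsed into a dictionary of strings, and the function name is hypothetical.

```python
def move_lang_to_first(params: dict) -> dict:
    """Turn |lang=xx into |1=xx and shift purely numeric parameters up
    by one. Named parameters such as nodot= or t1= are left alone."""
    if "lang" not in params:
        return dict(params)
    result = {"1": params["lang"]}
    for key, value in params.items():
        if key == "lang":
            continue
        if key.isdigit():
            result[str(int(key) + 1)] = value
        else:
            result[key] = value
    return result
```

For example, the parameters of a call like {{back-formation|burglar|lang=en}} would come out as those of {{back-formation|en|burglar}}.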

You should be careful about converting them to {{affix}} because that may give the wrong result if there are hyphens in the terms. For example, with PIE root plus suffix, the root should not be treated as a prefix despite the hyphen. —Rua (mew) 10:47, 11 March 2019 (UTC)[reply]

Probably forgot {{doublet}}, {{univerbation}}, {{reduplication}}, {{rebracketing}}, which are of the same code. Fay Freak (talk) 13:45, 11 March 2019 (UTC)[reply]

Is there some way to have templates like {{prefix}} and {{suffix}} cater to situations where the various elements of a word originate from different languages? It is not uncommon, for example, for a word to have a stem that is from Greek or Latin, and then an English affix like -ic or -ity. — SGconlaw (talk) 14:24, 11 March 2019 (UTC)[reply]

@Sgconlaw: The way I do it is using {{der}} for the foreign word, plus {{suffix|en||ic}} (or whatever), or {{prefix|en|[prefix]|}} plus foreign word. DonnanZ (talk) 17:30, 11 March 2019 (UTC)[reply]

|langN= “The language code to use for this particular part”, which stands at {{affix}}, is for all such cases, if I err not. Fay Freak (talk) 15:45, 11 March 2019 (UTC)[reply]

Thanks for making templates more consistent. Didn't know we had {{univerbation}}. Shouldn't it take the terms which make up the new word as parameters, @Fay Freak? – Jberkel 17:14, 11 March 2019 (UTC)[reply]

@Jberkel: It could, but this is above my coding capabilities. Also often you write something between the two terms, or it is one link already, for example on عَبْقَر (ʕabqar); عَيْن الْبَقَر (ʕayn al-baqar) can be an entry and عَبْقَر (ʕabqar) is, the same with قُرْوُسْطِيّ (qurwusṭiyy); or there isn’t a term that can be linked and it is SOP, as on Langschwert; one would also need to consider in the coding that one of the parts can be in a different language code-switched into the language or the like. The template has been created because it still eases, linking to the appendix correctly and categorizing. Fay Freak (talk) 17:49, 11 March 2019 (UTC)[reply]

OK. In addition to the above, I'm going to do the following:

Rename lang= to 1= on the other templates mentioned by User:Fay Freak: {{doublet}}, {{univerbation}}, {{reduplication}}, {{rebracketing}}, as well as synonyms.
Reduce some synonyms. In particular, I notice the following synonyms:
1. {{back-formation}} has synonyms {{back-form}}, {{backform}}, {{bac}} and {{bf}}. I see no reason for having so many synonyms, and only {{back-form}} and {{bf}} are mentioned in the docs, so I'll rename {{backform}} -> {{back-form}} and {{bac}} -> {{bf}}. In general I think we only need one synonym, which should be shorter than the original and short enough to type easily, and we only need such synonyms if the original is sufficiently long as to make typing it in full be annoying, hence {{blend}} doesn't need a shorter synonym. It's bizarre, for example, that {{calque}} (which is already pretty short) has four synonyms: {{cal}}, {{calq}}, {{clq}} and {{loan translation}}. Such profusion of synonyms IMO serves little purpose and just makes bot processing harder and more error-prone.
2. {{doublet}} has synonyms {{doublet of}} and {{etymtwin}}, which will be renamed to {{doublet}}.
3. {{metanalysis}} will be renamed to {{rebracketing}}.
4. {{reduplicated}} will be renamed to {{reduplication}}.
Convert these templates to Lua. I'll do that after the cleanup. All except {{blend}} and {{univerbation}} are basically a single link + some additional text and a category, and should be implemented with a single Lua function, with another function to handle {{blend}} and {{univerbation}} (similar to the code that implements {{compound}}, and maybe sharing that code).

Benwing2 (talk) 01:29, 12 March 2019 (UTC)[reply]

@Rua Thanks for pointing out the issue with PIE roots and such. I think {{affix}} should be extended to support such things (terms that aren't affixes but look like affixes) by using a special character to prefix such non-affixes. There are various reasons for doing this besides just making it easier to convert {{prefix}} and {{suffix}}; you might, for example, want to say something like {{affix|ine-pro|*prō-|^*bher-|*-ye-|*-ti}} (where I've used the caret ^ to indicate that the term should not be interpreted as an affix) and not have to worry about which (if any) of the *fix variants to use. What should that character be? Asterisk (*) is out because that indicates reconstructions; backslash (\) is a possibility as it has a similar function in code, but it might look ugly; exclamation point (!) is a possibility but I use it in a different (almost opposite) sense in {{affixusex}} and sisters; what do you think of caret (^)? Other possibilities are e.g. tilde (~), pound sign (#) or percent (%). Benwing2 (talk) 01:40, 12 March 2019 (UTC)[reply]

OK, I finished the first part of the cleanup and converted all the etymology templates to Lua. {{blend}} and {{univerbation}} now take multiple parts, similar to {{compound}}, and the others take a single part. Now I'm going to do the following: rewrite all uses of {{suffix}}, {{prefix}}, {{circumfix}}, {{infix}}, etc. that use |lang= to use |1=, moving the numbered parameters over by one; and then eliminate the compatibility support for |lang=. Benwing2 (talk) 02:03, 13 March 2019 (UTC)[reply]

I implemented support in {{affix}} for a caret (^) to mean non-affixal interpretation. I also fixed another issue preventing use of {{affix}} in East Asian languages, where the language-specific hyphen character was set to the empty string so that {{prefix}} and {{suffix}} wouldn't automatically add it. This setting formerly made it impossible to use {{affix}} for East Asian prefixes and suffixes. I fixed things so that you can use a regular hyphen to indicate an affix in these languages, but the displayed and linked term won't have a hyphen. I also fixed several edge-case bugs (e.g. the hyphen wasn't correctly added to translit in {{prefix}} and {{suffix}}, although the code clearly intended to do so), and cleaned up a lot of duplicative code. Benwing2 (talk) 01:42, 15 March 2019 (UTC)[reply]
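How a caret can override hyphen-based detection might be sketched like this. It is illustrative Python, not the Lua behind {{affix}}; the category labels, the treatment of a leading reconstruction asterisk and the function name are assumptions made for the example.

```python
def classify_part(part: str) -> tuple:
    """Return (display_form, kind) for one part of an affix list.

    A leading caret forces a non-affix reading and is dropped from the
    displayed form. Otherwise hyphens decide, ignoring any leading
    reconstruction asterisk.
    """
    if part.startswith("^"):
        return part[1:], "non-affix"
    core = part.lstrip("*")
    starts = core.startswith("-")
    ends = core.endswith("-")
    if starts and ends:
        kind = "interfix"
    elif ends:
        kind = "prefix"
    elif starts:
        kind = "suffix"
    else:
        kind = "non-affix"
    return part, kind
```

Applied to the parts of the example given earlier in this thread, `*prō-` is read as a prefix, `^*bher-` as a non-affix displayed as `*bher-`, and `*-ti` as a suffix.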

I think I've cleaned up all the cases of {{doublet}} with no term from the dump. Changes included merging {{doublet}} with a following {{m}}, or adding "of" after "doublet" because Benwing2's changes removed it, or making proper use of the |notext=1 option to hide "doublet of", or replacing the word "doublet" with the template rather than hiding the template (because removal of "of" makes that possible). — Eru·tuon 08:15, 27 March 2019 (UTC)[reply]

Should the parameter for “the term that this term was derived from” be optional in {{phono-semantic matching}} like it is in {{doublet}}? Since neither categorizes by the source language, but both dump everything into the respective language category of Category:Phono-semantic matchings by language and Category:Doublets by language, I tend to assume yes. A usage example for this would be دِين (dīn), since it is said to be “a historically conflated term derived from multiple layers of phono-semantic matching”, and one wants to use the template there and in similar entries to link the appendix entry and categorize. I understand that {{univerbation}} and {{blend}} also take the parameter only optionally, though this is not reflected yet. {{doublet}}, {{blend}} and {{univerbation}} are, however, sorted as Morphology templates (though I don’t know what doublets have to do with morphology) and {{phono-semantic matching}} as a foreign derivation template, and none of the “foreign derivation templates” permit omitting a term, though only some of the templates there categorize by the source language. You see, there is some inconsistency, though surely this is for the most part owing to the nature of the relations described by the derivation templates. This might suggest a different subcategorization of the etymology templates, but I do not have anything particular in mind, @Benwing2.

Or expressed all shorter and in sequential order, one must examine 1. for all etymology templates, if the fact that a source term is a mandatory parameter is reasonable 2. if the fact whether a source term is a mandatory parameter is reflected in the documentations of the etymology templates consistently so one does not have to guess or try 3. for all etymology templates, if the fact that there is no categorization by the source language is reasonable 4. if the categorization of the etymology templates themselves is reasonable. Fay Freak (talk) 23:33, 27 March 2019 (UTC)[reply]

@Fay Freak I can fix this. Benwing2 (talk) 01:05, 28 March 2019 (UTC)[reply]

Cleanups of form-of templates[edit]

While we are at it, I mention the Form-of templates. From olden times, when it was assumed that the language is English if not specified, they take the linked word as the first parameter and only |lang=, not a positional language parameter. While changing this would be a rather large undertaking, though probably consistent, and on topic since some imply etymology like {{clipping of}}, I mention these templates because of |nocap= and |nodot=, which constitute an annoyance. Some have a dot at the end by default, some do not. This is irritating and cannot or should not be memorized by the user. Furthermore it wastes precious effort of the fingers to type this parameter and “|nocap=1” every time one uses the templates, particularly {{alternative form of}} (which puts no dot by itself, but has the capitalization problem). As with the now standing practice of capitalization and dots in glosses, the setting should be inferred from the language code, i.e. dot and capitalization for the line in English entries and no dot and no capitalization in Arabic entries, unless specified otherwise. Fay Freak (talk) 13:45, 11 March 2019 (UTC)[reply]

Yes, I would support standardization of this category of templates. As they are mostly used in definitions, I would support the addition of a full stop at the end of the statement as the default, coupled with the ability to specify |nodot=1 or |nodot=yes to turn it off. — SGconlaw (talk) 14:18, 11 March 2019 (UTC)[reply]

Default for English, to be sure. In other languages the definition lines shan’t have dots nor begin capitalized. Fay Freak (talk) 15:42, 11 March 2019 (UTC)[reply]

I see no reason why English should be formatted differently. There should not be a dot or capital for English if it's not there for any other language. —Rua (mew) 17:22, 11 March 2019 (UTC)[reply]

I see none either, and I am strongly against this formatting in English; I was only stating the current “rule” deriving from practice. Dots at the end of the lines are noise; they don’t bear information. Similarly I am also against dots at the ends of footnotes, though some professors seem to think that they are required because footnotes are “sentences”. The capitalization even makes distinctions go away, say when there is an English word meaning one thing and a word written the same but capitalized meaning another thing. Plus the consistency argument. Fay Freak (talk) 17:49, 11 March 2019 (UTC)[reply]

I have to disagree, because I believe definitions should be treated as sentences and capitalized and punctuated as such. In some cases definitions consist of more than one sentence, and so will have to be punctuated. For consistency, those which consist of only one sentence should be given the same treatment. — SGconlaw (talk) 18:06, 11 March 2019 (UTC)[reply]

Only in those cases. Then the dot is a separator. Else you could as well use a semicolon, bar, star, or something ornate. It is arbitrary to see the glosses as “sentences” and to put dots behind them. In fact the word “gloss” tells us that they aren’t sentences, any more than in a Medieval manuscript if a word is explained in a gloss it ends with a dot. Also if they were sentences the headwords would be SOP. You understand this one? The glosses translate parts of sentences, phrases, but generally not sentences (sometimes as glosses of interjection, but this is exceptional and can also be omitted). Putting a dot after glosses and footnotes is simple hypercorrection. It is like ending parentheses with dots, wrong the same way. John – some guy I know from school. – has given me this book. This is wrong. Fay Freak (talk) 19:59, 11 March 2019 (UTC)[reply]

Oh, and by the way there is a practice not to end simple etymologies with dots, apparently standard in Russian entries. I remember having observed @Benwing2 removing such dots; so it looks on перепра́вить (pereprávitʹ). A dot wouldn’t do anything here, neither when we put the word “from” before the derivation. Saying this or that is “a sentence” because it is the abbreviation of a “sentence proper” or an imagined sentence in the strict sense and hence must be treated the same is just essentialist delusion. The question must be what is brought about with the signs. You might observe that it is not wrong at all if, when one nowadays chats, one sends single sentences without punctuation marks. This is because they don’t add any meaning, or even dilute it because the speaker did not want to decide between dot, …, !, ?. In a “serious” work it cannot be otherwise for single full sentences. One sometimes puts the dots in these because of the dogged old expectation that “every sentence must end with a punctuation mark” but there isn’t even this expectation with database-like entries on the internet. Fay Freak (talk) 20:22, 11 March 2019 (UTC)[reply]

@Fay Freak, Rua, Sgconlaw English is fundamentally different from foreign languages because the definitions of English terms are paraphrases using other English terms (normally full sentences or long phrases), while the definitions of foreign terms are English equivalent terms (often single words). For example, the definition of English umbrella is written as follows:

Cloth-covered frame used for protection against rain or sun.

while the definition of Portuguese guarda-chuva is written as follows:

umbrella

Note the difference in formatting. This is fairly consistent across Wiktionary and is the reason I remove final periods when they occur in Russian definitions. For this reason I think the use of capital letters and periods in templates like {{alternative form of}} should be different for English vs. foreign languages: Capital letter and period in English, lowercase letter and no period in foreign languages. Benwing2 (talk) 01:47, 12 March 2019 (UTC)[reply]
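One of the options under discussion, defaults inferred from the language code with explicit overrides, can be sketched as below. This is illustrative Python only; the function name and the three-state handling of nocap and nodot are assumptions for the example, not how the form-of module behaves.

```python
def format_form_of(text: str, lang: str, nocap=None, nodot=None) -> str:
    """Capital letter and final period by default for English only.

    nocap/nodot are None when the editor did not specify them; True or
    False overrides the language-based default.
    """
    is_english = lang == "en"
    capitalize = is_english if nocap is None else not nocap
    add_dot = is_english if nodot is None else not nodot
    if capitalize:
        text = text[:1].upper() + text[1:]
    if add_dot:
        text += "."
    return text
```

Under the competing view, that form-of lines are glosses in every language, the same function would simply default both settings to off regardless of the language code.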

— SGconlaw (talk) 01:57, 12 March 2019 (UTC)[reply]

I agree with Benwing. - -sche (discuss) 05:46, 12 March 2019 (UTC)[reply]

That's for full definitions. But form-of templates are not full definitions, they are glosses in all languages, so they shouldn't have a period. It's inconsistent when the exact same definition gets a period in one language and not others. —Rua (mew) 11:08, 12 March 2019 (UTC)[reply]

@Rua There is no consistency. {{misspelling of}}, for example (and other templates using {{deftempboiler}}) do include a final period; see e.g. concious, mispronounciation. {{alternative form of}} (and others using {{#invoke:form of|form_of_t}}) do not. But both include a capital letter at the beginning, which is wrong for non-English languages. If we are to follow your idea that form-of templates are glosses, there shouldn't be either a capital letter or a period in any language. Otherwise, we should have both capital letter and period in English, but not any other language. Either way, the current inconsistent situation needs to be fixed. Benwing2 (talk) 03:20, 13 March 2019 (UTC)[reply]

We may have to have a poll or vote on this. — SGconlaw (talk) 11:58, 13 March 2019 (UTC)[reply]

This only concerns the formatting in English, if this is supposed to change. I was here concerned about the default formatting and dotting for foreign languages, which I said should be inferred from the language code and made consistent, this not being in doubt here. How English lines should be formatted is a separable issue. @Benwing2 Also we have missed {{unknown}} in the template cleanup; it also shares that code. Fay Freak (talk) 17:06, 13 March 2019 (UTC)[reply]

The issues of English vs non-English can't be separated. If the same definition occurs in both languages, it should be formatted the same way too. "Plural of" definitions should not be capitalised in one language but not another, that's inconsistent. —Rua (mew) 20:35, 14 March 2019 (UTC)[reply]

We will need a poll, I think. Even within English the usage is totally inconsistent, e.g. Template:misspelling of has a capital letter and period, Template:alternative form of has a capital letter and no period, and Template:en-past of has no capital letter and no period. So far, User:Rua is the only person disagreeing with my suggestion of handling periods and capital letters in English vs. other languages. BTW @Fay Freak I cleaned up {{unknown}}, {{onomatopoeic}} and {{spelling pronunciation}} to use only |1=, not |lang=, and obsoleted/deleted {{unk.}} (use {{unk}} instead) and {{Onomatopoeic}} (use the lowercase equivalent). Benwing2 (talk) 00:08, 15 March 2019 (UTC)[reply]

@Fay Freak Also {{adverbial accusative}}, and fixed up that one and {{unknown}}, {{onomatopoeic}} and {{spelling pronunciation}} to use Lua (which should flush out some bad param usages). Benwing2 (talk) 00:32, 15 March 2019 (UTC)[reply]

Character insertion table inserting some characters twice[edit]

I use that character insertion table – not entirely sure what it is called – that appears below the edit window, to insert characters and symbols. In the "Miscellaneous" menu, clicking on the superscript "a" and "o" (ª, º), for some reason, inserts two symbols instead of one. I wonder if this can be fixed. I guess this is quite a minor point, but it would be nice if someone could look into it. Thanks. — SGconlaw (talk) 09:04, 12 March 2019 (UTC)[reply]

@Sgconlaw: Thanks for the report. Fixed. — Eru·tuon 19:59, 12 March 2019 (UTC)[reply]

Thanks! It's working as expected now. — SGconlaw (talk) 11:56, 13 March 2019 (UTC)[reply]

Issue with Module:ko-headword at 타일[edit]

Hello,

The new Korean entry 타일 (tail) currently shows empty brackets: 타일 • () when it should show 타일 • (tail) - with a transliteration in brackets. (Notifying TAKASUGI Shinji, HappyMidnight.) --Anatoli T. (обсудить/вклад) 00:42, 14 March 2019 (UTC)[reply]

Hmm, I had to create two entries (for now): 타일 and 타일. The former is bad but what's wrong with it and what's the difference? --Anatoli T. (обсудить/вклад) 00:45, 14 March 2019 (UTC)[reply]

The first one has invisible Unicode characters at the front: when I hover over the link in Firefox, I get %E2%80%8B%E2%80%8B타일. I don't know why the translit wouldn't appear, but perhaps it's a helpful bug? —Suzukaze-c ◇◇ 04:03, 14 March 2019 (UTC)[reply]

(edit conflict) @Suzukaze-c: Thanks! that's the first thing I suspected but I haven't checked very well. There are two ZWNJ characters. Yes, it's good that the module failed but without a descriptive message. It fails on South East Asian languages with an error message. Deleting the entry now. --Anatoli T. (обсудить/вклад) 04:17, 14 March 2019 (UTC)[reply]

타일 contains two zero-width spaces (U+200B). I’ll delete it. — TAKASUGI Shinji (talk) 04:15, 14 March 2019 (UTC)[reply]

Too late, gone :) Thank you both. --Anatoli T. (обсудить/вклад) 04:17, 14 March 2019 (UTC)[reply]

@TAKASUGI Shinji, Suzukaze-c: BTW, I created the bad entry from a translation at [[tile]], not my copypasta error. --Anatoli T. (обсудить/вклад) 04:19, 14 March 2019 (UTC)[reply]

Turns out there are a fair number of pages with zero-width spaces. — Eru·tuon 04:46, 14 March 2019 (UTC)[reply]
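A short sketch for spotting invisible characters of this kind in a page title, using only the Python standard library. It flags every character in Unicode general category Cf (format), which includes U+200B; the function name is hypothetical.

```python
import unicodedata
from urllib.parse import unquote


def invisible_characters(title: str) -> list:
    """List (index, code point, name) for each format character."""
    return [
        (index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(title)
        if unicodedata.category(ch) == "Cf"
    ]


# The percent-encoded prefix quoted above decodes to two code points.
decoded_prefix = unquote("%E2%80%8B%E2%80%8B")
```

Running `invisible_characters` over a list of page titles would be one way to find the other affected pages mentioned above.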

@Erutuon: Yes, thanks, many South East Asian entries copied from Sealang dictionary or Wikipedia need to be quality-checked. Proper correct entries with automated transliteration won't even work for languages such as Thai, Khmer and Burmese. --Anatoli T. (обсудить/вклад) 04:59, 14 March 2019 (UTC)[reply]

@Atitarev: It seems like Module:ko-headword was programmed to not transliterate if the entry isn't pure hangeul, which results in empty parentheses. I've changed it so that it produces a meaningful error message instead. —Suzukaze-c ◇◇ 04:48, 14 March 2019 (UTC)[reply]

@Suzukaze-c: Thanks, we need to check for exceptions, though. Arabic numbers, etc, may be part of valid titles, e.g. 7월 (7wol, “July”) = 칠월 (chirwol, “July”). The former is more common and a standard way of writing months in Korean. --Anatoli T. (обсудить/вклад) 04:59, 14 March 2019 (UTC)[reply]

Actually, I simplified my statement a bit. The module doesn't complain if the hangeul parameter is provided (as with hanja terms like 韓國). diff is my solution. —Suzukaze-c ◇◇ 05:23, 14 March 2019 (UTC)[reply]
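The behaviour described here, raising an error instead of silently showing empty parentheses unless a hangeul reading is supplied, could look roughly like the sketch below. It is illustrative Python, not the code of Module:ko-headword; the function name, the parameter name and the error wording are assumptions, and the check is deliberately strict, so titles containing digits such as the month names mentioned above would need an exception that is not shown.

```python
def is_hangul_syllable(ch: str) -> bool:
    """True for precomposed syllables in U+AC00..U+D7A3."""
    return "\uac00" <= ch <= "\ud7a3"


def transliteration_source(title: str, hangeul: str = None) -> str:
    """Return the text to transliterate, or raise a descriptive error.

    If a hangeul reading is given (as for hanja titles) it is used
    instead of the page title.
    """
    source = hangeul if hangeul is not None else title
    if not source or not all(is_hangul_syllable(ch) for ch in source):
        raise ValueError(
            "title is not pure hangeul and no hangeul reading was given"
        )
    return source
```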

@Suzukaze-c: Thanks, I didn't think about hangeul. rv= parameter is still required, if the reading is completely irregular, unless we respell the words completely, e.g. respell 십육 as "심뉵". If we do, we may consider adding long vowels but there's too much to do. Think about all geminations and all cases described at Template:ko-IPA/documentation. --Anatoli T. (обсудить/вклад) 05:32, 14 March 2019 (UTC)[reply]

Hm, I see. Perhaps we could try doing whatever the Thai infrastructure is doing (reading {{th-pron}}→reading {{ko-IPA}}). —Suzukaze-c ◇◇ 05:46, 14 March 2019 (UTC)[reply]

I don't object to having a more phonetic transliteration. The default transliteration is fine too but if a term is respelled with some additional parameters, we might as well transliterate e.g. 음식값 (eumsikgap) as something like "ēūmsikkap". "Tuttle Learner's Korean-English Dictionary" is already using a very phonetic transliteration; no long vowels are catered for, though. --Anatoli T. (обсудить/вклад) 05:57, 14 March 2019 (UTC)[reply]

I agree (I've also wondered in the past about including long vowels in the romanization), but I also haven't studied enough Korean, and wouldn't feel confident modifying Module:ko-pron appropriately _(：3 」∠ )_ —Suzukaze-c ◇◇ 06:09, 14 March 2019 (UTC)[reply]

The current pronunciation module is perfect and the quality of most entries is very high. We just need to merge "pron" with "translit", automate things. One of the reasons for not transliterating long vowels is digraphs, the other, maybe, the vowel length is not too prominent, semi-long and optional (?). Very phonetic transcription will invariably confuse someone who just wants to know the script and understand etymologies a bit more. But we already have two parts in the pronunciation box: e.g. at 선로 (線路, seollo): Revised Romanization "seollo" and Revised Romanization (translit.) "seonlo" --Anatoli T. (обсудить/вклад) 06:20, 14 March 2019 (UTC)[reply]

Long digraphs are supposed to be encoded by U+035E COMBINING DOUBLE MACRON after the first letter. e.g. e͞u. Of course, font support may be poor. --RichardW57 (talk) 06:37, 25 March 2019 (UTC)[reply]

Ancestors in Module:etymology languages/data[edit]

Recently the Proto-Oghuz language was moved to Module:etymology languages/data during Victar's work on the Turkic languages. It is still set as the ancestor of Old Anatolian Turkish, which is the ancestor of Ottoman Turkish, which is the ancestor of Turkish. To let the language modules find Proto-Turkic, the parent language of Proto-Oghuz, as an ancestor of Turkish (and prevent module errors in etymologies such as ötmek), Proto-Oghuz has trk-pro (Proto-Turkic) given as its ancestor.

The etymology templates can't interpret the parent value in the data module as an ancestor, because it isn't the same thing. For instance, qfa-sub-grc (Pre-Greek) has the parent qfa-sub (substrate), but Pre-Greek does not descend from "substrate". Similarly, American English (en-US) isn't a descendant of English (en), it's a subvariety. So an ancestors value has to be given. This is the only ancestor given in the module; an etymology language only needs an ancestor if it in turn is the ancestor of a regular (non-etymology) language.

Writing this note for DTLHS mainly, who removed the ancestors for trk-ogz-pro (Proto-Oghuz), and because the discussion on this issue was held in Discord. User:KevinUp, Crom daba, Victar, Surjection, and I were involved. — Eru·tuon 05:52, 14 March 2019 (UTC)[reply]

Indeed, we cannot rely on the parent field for ancestors because it doesn't really make sense for some of those languages. That doesn't mean it couldn't be changed to be that way, but that requires wider changes and the ancestors field is the only proper solution right now. — sur jec tion ⟨?⟩ 11:42, 14 March 2019 (UTC)[reply]
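The distinction between the parent and ancestors fields can be illustrated with a toy data table. This is Python for illustration, not the Lua data modules; the table layout, the function names and the codes for Ottoman Turkish and Old Anatolian Turkish are assumptions for the example, while `trk-pro`, `trk-ogz-pro`, `qfa-sub-grc` and `qfa-sub` are the codes quoted above.

```python
# Toy data. "parent" is a classification link; only "ancestors" is
# followed when walking the line of descent.
LANGUAGES = {
    "tr": {"ancestors": ["ota"]},             # Turkish (code assumed)
    "ota": {"ancestors": ["trk-oat"]},        # Ottoman Turkish (assumed)
    "trk-oat": {"ancestors": ["trk-ogz-pro"]},  # Old Anatolian Turkish (assumed)
    "trk-ogz-pro": {"parent": "trk-pro", "ancestors": ["trk-pro"]},
    "trk-pro": {},
    "qfa-sub-grc": {"parent": "qfa-sub"},     # Pre-Greek: parent, no ancestor
    "qfa-sub": {},
}


def all_ancestors(code: str, data: dict = LANGUAGES) -> list:
    """Collect ancestors transitively, ignoring the parent field."""
    found = []
    queue = list(data.get(code, {}).get("ancestors", []))
    while queue:
        current = queue.pop(0)
        if current in found:
            continue
        found.append(current)
        queue.extend(data.get(current, {}).get("ancestors", []))
    return found


def has_ancestor(code: str, other: str, data: dict = LANGUAGES) -> bool:
    return other in all_ancestors(code, data)
```

Without the explicit ancestors entry on Proto-Oghuz the walk from Turkish would stop there and never reach Proto-Turkic, which is the situation that produced the module errors.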

Transcription of Proto-Norse[edit]

Can someone fix it so that transcriptions don’t show up when having a Proto-Norse reconstruction with Latin letters within t:desc, t:m etc.? Jonteemil (talk) 18:01, 14 March 2019 (UTC)[reply]

@Jonteemil:

Done. Just had to add Latn to the scripts for Proto-Norse. — Eru·tuon 20:31, 14 March 2019 (UTC)[reply]

This doesn't seem like the right way to go. Proto-Norse was never written in the Latin script, so why do we have reconstructions in Latin script? Would we have reconstructions for Gothic, OCS or Ancient Greek in Latin script? —Rua (mew) 20:33, 14 March 2019 (UTC)[reply]

Adding Latin to the list of scripts is not meant as a declaration that Proto-Norse should be written in Latin script, only as an acknowledgement that it is and that the modules have to know that fact or they will generate a transliteration for a Latin-script word and add rune-specific classes to it that can make it display with strange fonts for those with rune-appropriate fonts installed. If Proto-Norse shouldn't be written in Latin script, someone can go replace Latin script with runes or move it to the transliteration parameter, or whatever is necessary. — Eru·tuon 20:40, 14 March 2019 (UTC)[reply]

Now it shows up in Category:Proto-Norse language, which certainly does seem like a declaration that Proto-Norse can be written in that script. —Rua (mew) 20:48, 14 March 2019 (UTC)[reply]

Unfortunately language categories can only reflect our actual language data, which in turn is dictated by practical considerations and not by various other theoretical ideals. (It is also impossible at the moment to have the family tree simultaneously display Scots as the descendant of Northern Middle English and Northumbrian Old English and display Northern Middle English and Northumbrian Old English as subvarieties of Middle English and Old English respectively, as would be more accurate than displaying Scots as the descendant of Middle English and Northumbrian Old English and Northern Middle English separately.) If you have another solution to the problem of Old Norse written in Latin script being displayed with Runic fonts and having transliteration added to it, please tell me. — Eru·tuon 21:22, 14 March 2019 (UTC)[reply]

I think Rua already alluded to the solution, which is to have reconstructions use runic script. —Μετάknowledge (discuss/deeds) 21:26, 14 March 2019 (UTC)[reply]

Sorry, I was wrong. I realize what should be done is to move the Latin script to the transliteration parameter. I'll undo my edit. — Eru·tuon 21:57, 14 March 2019 (UTC)[reply]

The reason the reconstructions are in the Latin script is that Svensk etymologisk ordbok has them in Latin characters. Also, I unfortunately don’t know which rune corresponds to which letter. Jonteemil (talk) 23:15, 14 March 2019 (UTC)[reply]

You should put Latin-script versions in the transliteration parameter (|tr= or |tr1=, |tr2=, etc. depending on the template). They will be tracked in Category:Proto-Norse terms needing native script and someone can add the Runic version. The current practice seems to be that reconstructed Proto-Norse terms are rendered in Runic script (see Category:Proto-Norse lemmas, which includes some terms in the Reconstruction namespace). — Eru·tuon 23:59, 14 March 2019 (UTC)[reply]

Ah, that makes sense! Jonteemil (talk) 16:11, 15 March 2019 (UTC)[reply]

IP Ban[edit]

hello!

this is not a constructive post. the constructive one was denied bc of specific spammer habits. here it is: hello! anyone considered an etymology based on "Magog" https://en.wikipedia.org/wiki/Gog_and_Magog on the "demagogue" discussion page.

obviously, my ip is banned. the reason: https://de.wikipedia.org/wiki/Benutzer:Baumfreund-FFM/R%C3%BCckblick#Administrative_Beitragszahlen_zum_Jahreswechsel_2016/17_(31._Dezember_2016) deletes 24thousands entries per year. at 240 work days, thats 100 per day. on a 10hrs day its 10 per hour. this guy slaughters one entry after the other. every 6minutes. hour for hour, day for day, year for year. my entry the other day impressed him so much that he woke up and banned me. that happens only 900 times a year due to him. only 3 times a day. day for day, year for year.

because its not possible to slaughter the entries on wikipedia plus the ones on wiktionary, these ip bans are very important to keep the wikis free. free in the definition of "empty".

and yes: i dont know if the grease page is the appropriate page for my defacement here. instead of "to deface" there is a free spot on the wiktionary synonyms page of "to grease".— This unsigned comment was added by 46.223.1.175 (talk).

First of all, actions taken at any other wiki have no effect on Wiktionary, so there is no IP ban involved. If there were, you would not be able to post any message, and you obviously just posted here. The message you got is from an abuse filter. Abuse filters are automated tests that the system runs while processing your edit to look for spam and vandalism. The abuse filter in question was created years ago, so it has nothing to do with you. I won't go into details, but the reason you triggered the filter had to do with the unnecessary URL you included in your edit: if you had written [[w:Gog and Magog]] or {{w|Gog and Magog}} instead of https://en.wikipedia.org/wiki/Gog_and_Magog, your edit would have had no problems. It just happened to fit a pattern that is almost never found except in edits by spambots. Sorry for the trouble.

As for your question: the derivation from an Ancient Greek term that was presumably once in actual use for demagogues is simple and obvious, so why try to dig up cryptic biblical references to enemies in prophecies? Chuck Entz (talk) 01:25, 16 March 2019 (UTC)[reply]

Catfix not working anymore[edit]

The fixup of categories to add language tags and section links isn't working anymore on Category:Old Dutch lemmas. —Rua (mew) 19:28, 15 March 2019 (UTC)[reply]

@Rua: Fixed. — Eru·tuon 19:37, 15 March 2019 (UTC)[reply]

Requesting change to template "langname-mention"[edit]

Hi, can someone change {{langname-mention}} so that it displays only the name of the language when the third parameter is a hyphen? For some reason, if the third parameter is a hyphen, it treats it literally as a hyphen rather than an empty string, which isn't the behaviour of other templates. --Florian Blaschke (talk) 01:28, 16 March 2019 (UTC)[reply]

Why are you using this template? What purpose does it serve that {{cog}} (or {{noncog}}) would not be better for? —Μετάknowledge (discuss/deeds) 02:54, 16 March 2019 (UTC)[reply]

It was created after a proposal of mine. It is for use in discussions, where {{cog}} or {{noncog}} are not appropriate. --Florian Blaschke (talk) 22:49, 19 March 2019 (UTC)[reply]

@Florian Blaschke: Done. — Eru·tuon 22:58, 19 March 2019 (UTC)[reply]

Thanks a bunch! --Florian Blaschke (talk) 23:08, 19 March 2019 (UTC)[reply]

CAT:E lots of 'em, xme no longer valid[edit]

Someone removed a language code causing lots of errors. Benwing2 (talk) 14:46, 16 March 2019 (UTC)[reply]

@Victar, it's better to do as much of the moving as possible before changing up a code so that you can avoid this. —Μετάknowledge (discuss/deeds) 15:39, 16 March 2019 (UTC)[reply]

Huh, I thought I got most of these. Thanks, I'll fix them now. --{{victar|talk}} 15:47, 16 March 2019 (UTC)[reply]

Spanish words with a dot[edit]

Hey all. Can someone make a list of all Spanish entries containing a dot? I have a feeling we have lots of incorrect abbreviations around here - S.T.D for example should be S. T. D.. We have Category:Spanish terms spelled with ., which is underpopulated and pretty useless. --I learned some phrases (talk) 08:44, 18 March 2019 (UTC)[reply]

Does User:I_learned_some_phrases/Spanish_Dots work for you? Also, I hate it when we have spaces in acronyms/abbreviations/initialisms. - TheDaveRoss 14:27, 20 March 2019 (UTC)[reply]

Hey Dave, that is just fine. If someone wants to add some to Category:Spanish terms spelled with ., they may. I doubt I'll do it. --I learned some phrases (talk) 17:00, 29 March 2019 (UTC)[reply]

Portuguese greenlinks[edit]

Plurals ending in -ais of adjectives ending in -al are using the template {{feminine singular of}} rather than {{plural of}}. Ultimateria (talk) 22:57, 19 March 2019 (UTC)[reply]

Colour table templates[edit]

In the Template:table:colors subpages, do we have a standard for dealing with multiple terms for one colour in a language? If not, let's pick one and document it. Looking at German for example (Template:table:colors/de) we have inconsistencies within the one template (some with a second colour in brackets, some with spaces between terms, some with commas between terms). -Stelio (talk) 14:10, 20 March 2019 (UTC)[reply]

[POLL] Further cleaning up form-of templates[edit]

Our handling of form-of templates is completely inconsistent. Some (the majority) use {{#invoke:form of|form_of_t}}, with either a capital or lowercase letter at the beginning of the template text (depending on the template) and no trailing period, while a significant minority use {{#invoke:form of|alt_form_of_t}} (formerly {{deftempboiler}}), with a capital letter at the beginning of the template text (unless |nocap=1 is used) and a trailing period (unless |nodot=1 or |dot= is used). For example, {{obsolete form of}} has a capital letter and period, while {{obsolete spelling of}} has a capital letter without a period and {{en-past of}} has a lowercase letter without a period. I argued above that all such templates should have a capital letter and period in English, but a lowercase letter without a period in other languages, for the following reason:

English is fundamentally different from foreign languages because the definitions of English terms are paraphrases using other English terms (normally full sentences or long phrases), while the definitions of foreign terms are English equivalent terms (often single words). For example, the definition of English umbrella is written as follows:

Cloth-covered frame used for protection against rain or sun.

while the definition of Portuguese guarda-chuva is written as follows:

umbrella

Note the difference in formatting. This is fairly consistent across Wiktionary and is the reason I remove final periods when they occur in Russian definitions. For this reason I think the use of capital letters and periods in templates like {{alternative form of}} should be different for English vs. foreign languages: Capital letter and period in English, lowercase letter and no period in foreign languages.

User:Sgconlaw and User:-sche agreed with me, User:Fay Freak appears to agree, while User:Rua disagrees. I'd like to take a poll to see what people think:

1. Leave everything the way it is, with all the inconsistencies.
2. Be consistent, using a trailing period and capital letter for English, and no trailing period or capital letter for other languages.
3. Be consistent, using a capital letter but no trailing period for English, and no trailing period or capital letter for other languages.
4. Be consistent, the same way across all languages, using a trailing period and capital letter everywhere.
5. Be consistent, the same way across all languages, using a capital letter but no trailing period everywhere.
6. Be consistent, the same way across all languages, using no trailing period or capital letter everywhere.

My personal feeling is I'd be fine with either #2 or #3 and probably OK with #6 as well; I feel more strongly about not having a trailing period or capital letter in foreign languages than the exact formatting in English. Benwing2 (talk) 21:29, 20 March 2019 (UTC)[reply]

BTW don't worry too much about technical issues involving changing the way templates handle trailing periods, I'm pretty sure I can figure out a way to do a bot run to automatically fix whatever we decide. Benwing2 (talk) 21:31, 20 March 2019 (UTC)[reply]

Consistency across languages makes more sense I think. Why would the exact same definition have one format in one language and another format in another language? Definitely 4-6. —Rua (mew) 21:38, 20 March 2019 (UTC)[reply]

#4. I don't mind the odd capital in non-Eng entries. There's also {{clipping of}} and co. – Jberkel 22:31, 20 March 2019 (UTC)[reply]

#4, with #2 as a backup if there's no consensus for #4. — SGconlaw (talk) 08:02, 21 March 2019 (UTC)[reply]
#4 is my preference too. -Stelio (talk) 08:52, 21 March 2019 (UTC)[reply]

Regarding the final dot, it must be kept in mind that it's much easier to add a dot than it is to remove it. Adding it means literally just typing a dot. Removing it means nodot=1, which is much longer. —Rua (mew) 23:52, 21 March 2019 (UTC)[reply]

I personally hate #4 in that IMO the trailing period looks very wrong in non-English entries, which are full of short glosses, not full sentences. If we are to settle for the same across all languages, I'd much prefer #6. Also keep in mind that, after looking through all the form-of templates, the majority of them don't have either a capital letter or period. I have also found the automatic capital letter/period very confusing and error-prone when trying to format entries properly with additional text. Benwing2 (talk) 20:06, 22 March 2019 (UTC)[reply]

I think 6 has my preference as well. —Rua (mew) 16:51, 29 March 2019 (UTC)[reply]

Templates/modules not minding what namespace they're in again[edit]

In e.g. this revision of Citations:recensent, the {{der}} and {{lb}} both put the page into categories. In the past, for at least a while, this was not the case and categories only got added to pages in certain namespaces. (I seem to recall seeing this issue pointed out twice before and it being fixed at least one of those times.) Is this something to be fixed, or would checking for namespace be too expensive Lua-memory-wise / would it be preferable to make checking categories for pages from unapproved namespaces an occasionally recurring TODO task? - -sche (discuss) 05:29, 21 March 2019 (UTC)[reply]

I guess Module:etymology and Module:labels will have to be modified so that they don't add categories in the Citations namespace. They use Module:utilities to format categories and the set of namespaces in which Module:utilities adds categories includes Citations, but the same list doesn't work for all modules; {{citation}} needs to be able to add categories to the Citations namespace, but {{der}} and {{lb}} shouldn't. — Eru·tuon 05:45, 21 March 2019 (UTC)[reply]

Ah, right, I remember now that that was an impediment before. Well, as to {{der}} etc, it's not just the Citations: namespace that they shouldn't categorize in (or, alternatively, shouldn't be used in and would need to be cleaned up out of): they shouldn't categorize in most namespaces, it's only a small list of namespaces where they should categorize (mainspace, Reconstruction:, maybe Appendix:; where else?). - -sche (discuss) 07:14, 21 March 2019 (UTC)[reply]

Right, but there's no need to worry about any other namespaces because Module:utilities already disables categorization there. The only namespace in which categorization needs to be manually disabled (that is, in which Module:etymology and Module:labels should not call the format_categories function from Module:utilities) is Citations. (I suppose it would be more straightforward or understandable to send a custom list of allowed namespaces to Module:utilities, but that would require changes to the categorization function.) — Eru·tuon 07:39, 21 March 2019 (UTC)[reply]

Actually, maybe it would be easier to have to enable categorization in the Citations namespace than to disable it. Most likely most templates or modules want to avoid categorization in the Citations namespace. — Eru·tuon 08:06, 21 March 2019 (UTC)[reply]

New rare initial for Khmer is required[edit]

(Notifying Stephen G. Brown, Octahedron80): Hi. Khmer ហ្ស៉េន "zen" can't be transliterated. Getting the error "Lua error in Module:km-pron at line 269: Error handling initial ហស៉." Can anyone with Lua skills please add this new initial ហ + ស, pronounced as /z/? It must be rare. It should follow the class of the initial consonant ហ. The whole term with diacritics should transliterate as "zeen", if using {{m|km|ហ្ស៉េន}}, for example. --Anatoli T. (обсудить/вклад) 23:43, 21 March 2019 (UTC)[reply]

(Notifying Dixtosa, Kc kennylau, Rua, Ruakh, ZxxZxxZ, Erutuon, Jberkel): Calling Lua experts as well. --Anatoli T. (обсудить/вклад) 23:44, 21 March 2019 (UTC)[reply]

I see ហស (class 1) and ហស៊ (class 2) already have /z/ in consonant table. Are you sure that ហស៉ is valid? (misspelling?) And which class is it? --Octahedron80 (talk) 23:57, 21 March 2019 (UTC)[reply]

@Octahedron80: Yes, interesting, thank you. ហ្សេន (zeen) already produces "zeen", which is already "a-series". It must be invalid then. I can't understand the module well. --Anatoli T. (обсудить/вклад) 00:12, 22 March 2019 (UTC)[reply]

sort_key for Northern Tepehuan[edit]

Could someone add the following to the entry for Northern Tepehuan (ntp) at Module:languages/data3/n?

	sort_key = {
		from = {"á", "é", "í", "ó", "ú"},
		to   = {"a", "e", "i", "o", "u"}},

--Lvovmauro (talk) 05:30, 22 March 2019 (UTC)[reply]

Added. DTLHS (talk) 05:46, 22 March 2019 (UTC)[reply]
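For readers unfamiliar with sort keys, here is a rough Python sketch of what a from/to table like the one requested above does when entries are sorted. The function name `make_sort_key` is hypothetical, the sample strings are invented (they are not real Northern Tepehuan words), and this is only an illustration of the idea, not the code Module:languages actually runs.

```python
# Hypothetical illustration of a from/to sort key; not the actual module code.
FROM = ["á", "é", "í", "ó", "ú"]
TO = ["a", "e", "i", "o", "u"]

def make_sort_key(text, from_chars=FROM, to_chars=TO):
    """Replace each accented vowel with its unaccented counterpart."""
    for src, dst in zip(from_chars, to_chars):
        text = text.replace(src, dst)
    return text

# Invented strings, used only to show the effect on ordering.
words = ["úta", "uba", "ába", "abi"]
sorted_words = sorted(words, key=make_sort_key)
print(sorted_words)
```

With the key applied, accented and unaccented vowels sort together instead of the accented forms being pushed to the end of the category.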

Substitutable params in documentation template data?[edit]

@Rua Do you (or does anyone) know how to transclude text inside of the <templatedata> tag that holds template data for documentation? It doesn't appear to work at all. I tried putting the start tag, end tag and contents in three different templates but that doesn't work either. This appears to mean that there's no way to templatize the template data and share it across the documentation of multiple similar templates (e.g. the form-of templates). If so, this really sucks as it means we have to copy the entire template data on each documentation page. (In general the template data appears badly thought out, e.g. why does it have to be visible on the doc page? Why can't it live on another page?) Benwing2 (talk) 01:46, 23 March 2019 (UTC)[reply]

The tag magic word works: {{#tag:templatedata|{{template with TemplateData tag contents}}}}. — Eru·tuon 02:26, 23 March 2019 (UTC)[reply]

@Erutuon Thanks! Benwing2 (talk) 12:34, 24 March 2019 (UTC)[reply]

@Erutuon Do you know if you can stick the template data inside of <includeonly>...</includeonly> tags and still have it work? That way it won't clutter up the visible documentation. Benwing2 (talk) 03:54, 25 March 2019 (UTC)[reply]

@Benwing2: That's beyond my knowledge. I haven't worked with TemplateData much. — Eru·tuon 04:36, 25 March 2019 (UTC)[reply]

@Erutuon OK, thanks. Benwing2 (talk) 04:48, 25 March 2019 (UTC)[reply]

Why can't I save my page?[edit]

Why can't I save my page? 194.228.13.62 13:16, 23 March 2019 (UTC)[reply]

I have tried to save my page, but table with

This action has been automatically identified as harmful. Unconstructive edits will be quickly reverted, and egregious or repeated unconstructive editing will result in your account or IP address being blocked. If you believe this action to be constructive, you may submit it again to confirm it.

A brief description of the abuse rule which your action matched is: probably vandalism. If you believe your edit was flagged in error, you may report it on the Wiktionary:Grease pit.

still shows. Why can't I JUST save my page and add it here? WHAT IS WRONG??? 194.228.13.62 13:17, 23 March 2019 (UTC)[reply]

We have abuse filters to prevent vandalism and badly formatted entries. They sometimes catch formatting errors. Every month or so such a filter catches an error in something I am trying to save. For me it is usually unmatched brackets. DCDuring (talk) 13:27, 23 March 2019 (UTC)[reply]

And are you able to save my page? I'm unable to add the text here, because it disallows me to do so. I can't even add the name here. 194.228.13.62 13:30, 23 March 2019 (UTC)[reply]

Spell it out - one letter at a time. SemperBlotto (talk) 14:23, 23 March 2019 (UTC)[reply]

@SemperBlotto you can see which edits by a particular user have triggered abuse filters here.

@194.228.13.62 it looks like your entry is flagged due to the combination of characters which form the word (i.e. no vowels mostly). I created the page (čtvrthrst) based on your first edit attempt, now that it exists I think you should be able to modify it if necessary. Sorry for the inconvenience and thanks for contributing despite the hurdles. - TheDaveRoss 15:46, 23 March 2019 (UTC)[reply]

Thanks a lot! 194.228.13.62 15:48, 23 March 2019 (UTC)[reply]

Lol, our abuse filters blocked Czech for looking too much like gibberish. —Rua (mew) 16:02, 23 March 2019 (UTC)[reply]

Does anyone have a better solution for this ugly code?[edit]

In Module:fy-adjectives there are a few lines like this:

data.forms["pred|comd"] = data.forms["pred|comd"] or {}; table.insert(data.forms["pred|comd"], form)
data.forms["indef|n|s|comd"] = data.forms["indef|n|s|comd"] or {}; table.insert(data.forms["indef|n|s|comd"], form)

The idea is to first set the value to an empty table if it's currently nil, then append a new value to either that empty table or the existing one, depending on which was the case. It's rather ugly code but I can't think of something elegant at the moment. @Erutuon, Benwing2 Does anyone have an idea for nicer code? —Rua (mew) 21:37, 23 March 2019 (UTC)[reply]

You can use Module:auto-subtable. — Eru·tuon 21:46, 23 March 2019 (UTC)[reply]

Thank you, I figured someone would have made a nice solution already. Is there a reason why the metatable is removed at the end? —Rua (mew) 21:51, 23 March 2019 (UTC)[reply]

Removing before make_table is probably necessary, because the code in make_table is checking whether fields are nil (local forms = data.forms[param]; if not forms then ... end). And I've run into weird bugs when the metatable is left on the table, so I figure it's generally good to remove it by default after the code that actually needs it. — Eru·tuon 22:03, 23 March 2019 (UTC)[reply]
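As an analogy only (Python rather than Lua, and not how Module:auto-subtable is implemented), the "create the list on first access, then go back to a plain table" pattern discussed above might be sketched like this. The key names come from the thread; the form strings are placeholders.

```python
from collections import defaultdict

# Missing keys get a fresh empty list on first access,
# so no "x = x or {}" boilerplate is needed before appending.
forms = defaultdict(list)
forms["pred|comd"].append("form1")
forms["indef|n|s|comd"].append("form1")
forms["pred|comd"].append("form2")

# Convert back to a plain dict once the filling-in is done. This is similar
# in spirit to removing the metatable: later "is this key missing?" checks
# behave normally and no longer create entries as a side effect.
plain_forms = dict(forms)
print(plain_forms.get("pred|sup"))
```

The conversion step matters for the same reason given above: code that tests for absent fields would otherwise never see one.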

edit request: MediaWiki:Common.css[edit]

These are my proposed changes. They mostly have to do with CJK and consistency, although I also did minor cleanup elsewhere. —Suzukaze-c ◇◇ 22:14, 23 March 2019 (UTC)[reply]

@Suzukaze-c: I looked at it, but was kind of overwhelmed by the volume of changes. It would be easier to merge changes bit by bit. I could start with the style changes at least, but I don't understand all of the CJKV stuff. — Eru·tuon 03:52, 27 March 2019 (UTC)[reply]

@Erutuon: Thanks. You could look at the history of User:Suzukaze-c/common.css, although it might still be unclear. I'd be glad to answer any questions. —Suzukaze-c ◇◇ 05:56, 27 March 2019 (UTC)[reply]

@Erutuon: Could I bother you to review and add the rest? —Suzukaze-c ◇◇ 04:54, 3 April 2019 (UTC)[reply]

@Suzukaze-c: Thanks for bothering me. I've merged some more changes, and now I'm coming to some of the CJKV changes that I'm not sure about. Like, do other people agree that bolding should be re-enabled? — Eru·tuon 19:09, 3 April 2019 (UTC)[reply]

In regards to bolding, admittedly it has not been discussed. However, I've seen one fellow "fix" .Jpan using personal JS (while I was stalking Recent changes), and it has also been brought up here. I also thought of diff. I would really like to know the original reason why bolding was disabled and replaced with enlargement of text; the comment claims that "fonts are really big in Japan", but I haven't seen this practice anywhere else. In traditional Japanese typography, "normal" text is typeset in a serif font, while "bold" text is set in a heavier sans-serif font (w:ja:太字#日本語における太字), but that wouldn't really work out here, when the body uses a sans-serif font. Alternatively, we might use 圏点 or underlining. Perhaps opening a more open discussion first would indeed be more appropriate.

—Suzukaze-c ◇◇ 19:56, 3 April 2019 (UTC)[reply]

@Suzukaze-c: I think the original reason why bolding was disabled was due to MS PGothic having only one weight. Now we've eliminated that font, we can restore the normal bolding. --Dine2016 (talk) 03:18, 4 April 2019 (UTC)[reply]

Maybe, but it was done for Chinese as well. —Suzukaze-c ◇◇ 03:35, 4 April 2019 (UTC)[reply]

Bot Task- Remove All Empty Parameters[edit]

Every time a template gets converted to use Module:parameters, CAT:E fills up with "{fill-in-the-blank} is not used by this template" errors for empty named parameters: the parameter name followed by "=", and nothing else. I'm assuming these are there in the first place because people are either substing templates that have all the parameters present so they can be filled in by hand, or people are copy-pasting the empty template into entries and customizing it for each entry. These parameters do nothing and consequently are ignored by both the system and the editors.

This seems to me like a wholly preventable problem. It should be fairly simple to remove all of these by bot without any ill effects. It would no doubt be time-consuming up front, but would save as much or more time that would otherwise be spent cleaning up module errors later on, errors that temporarily break the entries and make a bad impression on site visitors until they're fixed.

A possible second step would be to add an abuse filter once they're gone to remind contributors not to leave empty parameters (I can see how this might interact poorly with the way preload templates are used for new entries, though).

Anyone interested? Chuck Entz (talk) 16:11, 24 March 2019 (UTC)[reply]

First of all, using parameters in a template should not throw an error, there is no reason for that not to fail gracefully from a presentation perspective. Categorizing them for cleanup is one thing, having a big, red error message show in place of what might otherwise be a perfectly acceptable template output is just silly. We should design modules to ignore unexpected parameters instead, and only give errors when parameters are missing or of the wrong type.

Second, I think your idea to remove blank labeled parameters (and empty parameters when applicable) is sound, as long as nobody can provide instances when they are beneficial.

Third, sure I would be happy to run a bot to do some cleanup once the specifics are nailed down. - TheDaveRoss 16:40, 24 March 2019 (UTC)[reply]

@Chuck Entz My apologies, in this case the error messages about dot= arose as follows: (1) form_of_t in Module:form of was allowing but ignoring 'dot'; (2) I created a tracking category to find all uses of 'dot'; (3) I cleaned them up; (4) I disallowed 'dot'; (5) some errors ensued because the tracking category I created failed to track cases with empty dot=. As these are only a few, I've been cleaning them up as they come, but if you want I'll temporarily add 'dot' back as an allowed param and reenable the tracking for empty dot=, and then clean them up over several weeks as the site gradually reprocesses all the pages. Benwing2 (talk) 18:27, 24 March 2019 (UTC)[reply]

@Chuck Entz I went ahead and added 'dot' back and fixed tracking for empty dot=. Benwing2 (talk) 02:19, 25 March 2019 (UTC)[reply]

@Benwing2: That might take a while. I noticed that all templates invoking Module:form of seem to end in of, so I went through the dump and gathered instances of templates ending in of, gathered the templates that include |dot=, and created this list, if you can use it for a bot run. — Eru·tuon 03:59, 25 March 2019 (UTC)[reply]

@Erutuon Thanks, this is helpful. Most of these are using the alt_form_of_t form-of templates, which do support dot= and by default add a final period, but I have the full list of both types of templates, so I can sort things out. Benwing2 (talk) 04:08, 25 March 2019 (UTC)[reply]

@Benwing2: Whoops, the list may be missing a few entries because I didn't trim the template names. I've added trimming of ASCII whitespace and am regenerating the list. Not sure if I need to handle Unicode whitespace too, which would be more complicated. [Edit: Never mind, none were missing.] — Eru·tuon 04:38, 25 March 2019 (UTC)[reply]
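A minimal sketch of the kind of scan described above, in Python, assuming the page text has already been extracted from the dump. The regex and the function name `templates_with_dot` are hypothetical; the sketch only handles template calls with no nested templates inside them, and it trims ASCII whitespace only, as mentioned above.

```python
import re

# Matches a template call that contains no nested braces: {{name|params}}
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)((?:\|[^{}]*)?)\}\}")

def templates_with_dot(wikitext):
    """Return (name, value) pairs for templates whose trimmed name
    ends in 'of' and that carry a dot= parameter (empty or not)."""
    results = []
    for match in TEMPLATE_RE.finditer(wikitext):
        name = match.group(1).strip(" \t\n\r")  # ASCII whitespace only
        if not name.endswith("of"):
            continue
        for param in match.group(2).split("|")[1:]:
            key, sep, value = param.partition("=")
            if sep and key.strip() == "dot":
                results.append((name, value.strip()))
    return results

sample = (
    "# {{plural of |foo|lang=en|dot=}}\n"
    "# {{alternative form of|bar|lang=en}}\n"
    "# {{lb|en|rare}} {{obsolete form of|baz|dot=;}}"
)
found = templates_with_dot(sample)
print(found)
```

Without the trimming step, the first template in the sample (written with a space before the pipe) would be missed, which is the kind of gap the regenerated list was meant to rule out.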

@Erutuon OK, cool. Benwing2 (talk) 04:49, 25 March 2019 (UTC)[reply]

@TheDaveRoss I disagree that we should just ignore unrecognized params, vs. throwing an error on unrecognized params, because these unrecognized params generally represent mistakes (particularly typos). If we display an error, the person who made the error will generally see it and fix it; otherwise they won't know they made a mistake, and the error may never be fixed, even if it sits in a silent tracking category. As an example, the *fix templates used to ignore and silently track cases where the user accidentally wrote 't', 'tr', etc. (in place of 't1'/'t2'/'t3', ..., 'tr1'/'tr2'/'tr3'/etc.). Before I turned on the code to make these cases an error, I went through and manually emptied the categories, which was a lot of work because there were over 1,000 such cases. This could not be done by bot because there was no way to automatically determine which of 't1', 't2', 't3', etc. was correct. By throwing an error, we ensure that such incorrect uses don't accumulate. Benwing2 (talk) 18:34, 24 March 2019 (UTC)[reply]

@TheDaveRoss BTW you need to be careful about blithely removing all empty named params because there may be templates where such params are significant. For example, the various templates that formerly used {{deftempboiler}} used to recognize empty dot= to indicate that the default trailing period should be suppressed. When I converted these templates to use alt_form_of_t in Module:form of, I wrote a bot script to find all uses of empty dot= and convert them to nodot=1 (because having empty params be significant is very fragile and generally a bad idea); in this case removing them would have been incorrect. In general there's no way to know whether a given empty or blank param is significant except by analyzing the template or Lua code that implements the template in question. Benwing2 (talk) 18:39, 24 March 2019 (UTC)[reply]
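To make the caveat concrete, here is a hedged Python sketch of a cleanup pass that strips empty named parameters but skips a per-template allowlist. The names `strip_empty_named_params` and `SIGNIFICANT_EMPTY` are hypothetical, the allowlist entry is only an example, and the regex ignores nested templates; a real bot run would need the template-by-template analysis described above.

```python
import re

# Matches a template call that contains no nested braces: {{name|params}}
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)((?:\|[^{}]*)?)\}\}")

# Hypothetical allowlist of (template, parameter) pairs where an empty
# value is assumed to be meaningful and must be left alone.
SIGNIFICANT_EMPTY = {("obsolete form of", "dot")}

def strip_empty_named_params(wikitext):
    """Remove 'name=' parameters with no value, except allowlisted ones."""
    def fix(match):
        name = match.group(1).strip()
        kept = []
        for param in match.group(2).split("|")[1:]:
            key, sep, value = param.partition("=")
            # Explicitly numbered parameters (e.g. "2=") are left alone
            # to stay conservative about positional arguments.
            is_empty_named = (
                bool(sep) and value.strip() == "" and not key.strip().isdigit()
            )
            if is_empty_named and (name, key.strip()) not in SIGNIFICANT_EMPTY:
                continue
            kept.append(param)
        return "{{" + "|".join([match.group(1)] + kept) + "}}"
    return TEMPLATE_RE.sub(fix, wikitext)

print(strip_empty_named_params("{{m|en|foo|tr=|t=bar}}"))
print(strip_empty_named_params("{{obsolete form of|baz|dot=}}"))
```

Empty positional parameters (as in `{{der|en|la||foo}}`) are untouched, since removing one would shift every parameter after it.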

@Benwing2: I agree with your second point, which is what my third point was about. There will be times when it is not appropriate to remove them and we need to be conscious of that. Re your first point, having a big red error message in the middle of an entry (or many entries) is not a particularly elegant means to alert folks to their errors, especially when the error doesn't actually prevent the correct outcome from being presented. If I stuck a stanza= into {{quote-song}} but otherwise populated the necessary parameters correctly that needn't prevent the quote from displaying at all. In fact some (possibly) useful information might be included. It is a minor point, but I don't like showing errors in production environments unless they are unavoidable. - TheDaveRoss 22:06, 24 March 2019 (UTC)[reply]

@TheDaveRoss: Yeah, I understand your point, I just don't see any other practical way in general of alerting people that they made a mistake (since they're almost always in fact mistakes, usually a misspelling of a parameter). Having an error on a lot of pages is definitely a bad idea but usually it's only a few, and they get fixed quickly. Benwing2 (talk) 22:50, 24 March 2019 (UTC)[reply]

@Benwing2: I suppose, ideally, the error would show on the preview and be suppressed once the page is saved. I am not sure if Modules can tell if they are in a preview context or not, perhaps some clever use of subst: could accomplish this. - TheDaveRoss 23:07, 24 March 2019 (UTC)[reply]

Module:Check for unknown parameters on Wikipedia uses frame:preprocess( "{{REVISIONID}}" ) == "" to determine if it is in preview mode and generate different text, but it's maybe simpler to generate the same text but use the previewonly class to hide it, as {{IPA}}, {{Q}}, {{ar-root}}, and {{grc-IPA}} do. — Eru·tuon 23:24, 24 March 2019 (UTC)[reply]

It's a good idea to track the unrecognized parameters so that they can be cleaned up when a template is first converted to use Module:parameters. The module errors can be turned on when no instances have unrecognized parameters. To find unrecognized parameters, it's not reliable to use a tracking template when lots of widely transcluded modules are being edited: pages will update slowly and "what links here" page for the tracking template will take a while to fill up. (That's been happening lately because of all of Benwing2's work. I think it would be a good idea to make fewer edits to widely transcluded modules, by using sandbox modules for instance. That's the practice often taken on Wikipedia.) Dump analysis is a more reliable method. I do have a program to grab all instances of a template for easier analysis, if that would be useful. — Eru·tuon 20:01, 24 March 2019 (UTC)[reply]

@Erutuon Generally I do use userspace modules. Sometimes I've not but I'll be better about that in the future. As for dump analysis, that is a good idea; I've been doing all my work using refs and categories but it looks like the dump is only 600-800 MB compressed (I expected more like 10+ GB based on working with Wikipedia dumps). BTW do you have experience working with Toolforge? If so any comments as to how difficult it is to set up, how reliable, etc.? Benwing2 (talk) 21:38, 24 March 2019 (UTC)[reply]

@Benwing2: Yeah, the Wiktionary dump is a lot smaller than the Wikipedia one. At one point I considered searching the Wikipedia dump for something or other, but gave up when I saw how huge it was.

I haven't used Toolforge, but Dixtosa created a tool for finding entries with a particular suffix so maybe he would have some comments on it. After the recent discussion about replacing {{redlink category}} with a toolserver, I've been thinking seriously about trying to devise a toolserver that uses one or more of my dump analysis programs, for instance one that would allow people to get or search all instances of a particular template, or to find all instances that use a particular parameter. That could be complicated, though, and I don't have any experience with web servers. — Eru·tuon 22:21, 24 March 2019 (UTC)[reply]

@Erutuon I did my dissertation work partly on Wikipedia; the dump was around 9 GB bz2-compressed at the time, maybe 30 GB uncompressed, and I imagine it's grown significantly since then. Processing it took a while, and doing things like sorting the entries wasn't easy on a 16 GB laptop, since it couldn't fit all in memory. Benwing2 (talk) 22:34, 24 March 2019 (UTC)[reply]

The latest dump is 6.34 GB uncompressed. I don't load it all at once into memory but I use an XML iterator which is fast enough for my purposes. DTLHS (talk) 23:31, 24 March 2019 (UTC)[reply]
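For readers who haven't processed a dump this way, here is a minimal sketch of the streaming approach, using only the Python standard library. It is not DTLHS's actual program: the function name `iter_pages` and the tiny inline sample are made up for illustration, and the namespace URI in the sample is an assumption (the code ignores namespaces, so the exact export version doesn't matter).

```python
import io
import xml.etree.ElementTree as ET


def iter_pages(source):
    """Yield (title, text) for each <page> in a MediaWiki XML export, one at a time."""
    title, text = None, None
    for _event, elem in ET.iterparse(source, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix, if any
        if tag == "title":
            title = elem.text
        elif tag == "text":
            text = elem.text or ""
        elif tag == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # discard the finished page's contents to keep memory use low


# Hypothetical two-page sample standing in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>foo</title><revision><text>{{plural of|en|bar}}</text></revision></page>
  <page><title>baz</title><revision><text>==English==</text></revision></page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
```

For a real compressed dump, passing `bz2.open(path, "rb")` as `source` should work the same way, since `iterparse` accepts any binary file object.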

The all-revisions version is big; the current-revisions version is smaller. In other good news, the toolserver has access to replica databases, so you don't have to parse XML dumps: you can just query the actual MW database structure for the vast majority of tasks. They are also much more up-to-date. - TheDaveRoss 23:46, 24 March 2019 (UTC)[reply]

Haha, no, I'm not going to do 5 million queries for each report I want to run. The things I actually care about are most definitely not captured in the "MW database structure". DTLHS (talk) 23:51, 24 March 2019 (UTC)[reply]

For sure it is not useful in every case, but in this case (finding empty parameters for particular template calls) a single regex query could easily return a list of all pages which match. Further you can create tables on toolserver, so things like capturing a complete list of all titles by language can be done once and then leveraged moving forward, etc. - TheDaveRoss 00:07, 25 March 2019 (UTC)[reply]
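To give an idea of the kind of pattern involved, here is a rough Python sketch that lists named parameters left empty in calls to one template. It is not the replica-database query itself, only an illustration run on a wikitext string. The function name `empty_params` is made up, and the pattern assumes simple calls with no nested templates or links containing pipes.

```python
import re


def empty_params(wikitext, template):
    """Return the names of named parameters given as empty in calls to `template`."""
    # Matches {{template|...}} as long as the call contains no nested braces.
    pattern = re.compile(r"\{\{\s*" + re.escape(template) + r"\s*(\|[^{}]*)\}\}")
    found = []
    for match in pattern.finditer(wikitext):
        for part in match.group(1).split("|")[1:]:
            name, sep, value = part.partition("=")
            if sep and not value.strip():  # "name=" with nothing after it
                found.append(name.strip())
    return found


sample = "{{m|en|foo|t=}} and {{m|en|bar|tr=baz}}"
print(empty_params(sample, "m"))
```

Empty positional parameters (as in `{{m|en||foo}}`) are deliberately not reported here, since they are often intentional.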

Tensification of Korean voiceless consonants

{{ko-IPA}} takes the parameter “com=” to add gemination/tensification of certain consonants, but it seems to only work with voiced consonant letters, such as ㄱ, not with their voiceless counterparts, such as ㅋ. I need to make 카페 (kape) respelled as “까페” with |com=0 without the actual respelling. (Notifying TAKASUGI Shinji, HappyMidnight): Does it sound reasonable? Can it be done? --Anatoli T. (обсудить/вклад) 09:43, 26 March 2019 (UTC)[reply]

카페 and 까페 are pronounced and written differently. Just try googling it. There is no phonological process to turn ㅋ into ㄲ. — TAKASUGI Shinji (talk) 09:50, 26 March 2019 (UTC)[reply]

@TAKASUGI Shinji: Is it so rare? 커피 (keopi) seems to be also pronounced as "꺼피". What do you suggest? Just a complete respelling, as I did in diff? Re: googling - I can't read Korean well and there's very little on this topic in English. --Anatoli T. (обсудить/вклад) 13:00, 26 March 2019 (UTC)[reply]

It is a different way of pronouncing foreign words rather than a phonological process. You can find 꺼피 on Google. The accepted mismatch (unofficial though) is pronouncing ㅅ as ㅆ, which we have talked about in User talk:TAKASUGI Shinji/2015#서클. — TAKASUGI Shinji (talk) 00:57, 27 March 2019 (UTC)[reply]

Proto-Chatino

I'd like to request that a language code be added for Proto-Chatino. Chatino already has the family code omq-cha, so the protolanguage should be omq-cha-pro --Lvovmauro (talk) 07:55, 28 March 2019 (UTC)[reply]

@Lvovmauro Added. DTLHS (talk) 19:11, 3 April 2019 (UTC)[reply]

Text substitution for Western Highland Chatino

Could someone add the following to the entry for Western Highland Chatino (ctp) at Module:languages/data3/c? (I hope I have the formatting right.)

entry_name = {
	from = {"[¹²³⁴⁵]"},
	to   = {}},
sort_key = {
	from = {"á", "é", "í", "ó", "ú"},
	to   = {"a", "e", "i", "o", "u"}},

--Lvovmauro (talk) 09:00, 28 March 2019 (UTC)[reply]

Added. DTLHS (talk) 15:43, 28 March 2019 (UTC)[reply]
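For anyone unsure what the two substitutions requested above do, here is a small Python sketch of the same replacements. It only mirrors the `from`/`to` lists in the request and is not how Module:languages applies them. The function names are made up, the sample string is an invented placeholder rather than a real Chatino word, and precomposed (NFC) accented vowels are assumed.

```python
import re

# entry_name: strip superscript tone numbers, as in from = {"[¹²³⁴⁵]"}, to = {}
TONE_DIGITS = re.compile("[¹²³⁴⁵]")

# sort_key: replace acute-accented vowels with plain ones
ACUTE_TO_PLAIN = str.maketrans("áéíóú", "aeiou")


def entry_name(term):
    """Page title for a term: the term without tone superscripts."""
    return TONE_DIGITS.sub("", term)


def sort_key(term):
    """Sort key for a term: acute accents removed from vowels."""
    return term.translate(ACUTE_TO_PLAIN)


sample = "kwá³⁴"  # invented placeholder, not an attested form
print(entry_name(sample))            # tone superscripts removed
print(sort_key(entry_name(sample)))  # acute accent removed as well
```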

Cleaning up form-of template shortcuts

In the process of cleaning up the form-of templates, I noticed that the current situation w.r.t. shortcuts and aliases of the templates is an utter mess. Some issues:

Total inconsistency in the way the shortcuts are named.
Most of the shortcuts and alternative names aren't even documented.
Misleading or confusing names: e.g. (1) {{clip}} is a bad shortcut for {{clipping of}} because {{clipping}} also exists and is different (it is used in etymology sections instead of definitions); how would we know which one is being shortened? (Below I suggest {{clipof}} instead.) (2) {{abbreviation}} is a bad alias for {{abbreviation of}} because the example of {{clipping of}} vs. {{clipping}} suggests that {{abbreviation}} should be the etymology-section equivalent of {{abbreviation of}}, rather than just an alias.
Incorrect names: {{neuter of}} is an alias for {{neuter singular of}} but is an incorrect name because the template displays "neuter singular", and plain "neuter" isn't the same as "neuter singular" (e.g. we distinguish {{genitive of}} vs. {{genitive singular of}}). Similarly for {{feminine past participle of}}, an alias of {{feminine singular past participle of}}. Also, {{superseded form of}} is an alias of {{superseded spelling of}} but everywhere else we distinguish "foo form of" from "foo spelling of".
Unnecessary shortcuts: e.g. {{shortfor}} is one character shorter than {{short for}}. Why bother?
Aliases that are longer than the original long-form name, e.g. {{common misspelling of}} as an alias of {{misspelling of}}.

In general I think we ought to have one canonical long-form template name, and one single shortcut, with consistent shortcut naming. Having multiple aliases around is a maintenance headache, and IMO doesn't help editors if the names are near each other in the alphabet, as the correct names can easily be located using auto-complete.

The convention I've followed below for shortcut naming is if the long form is Template:Foobar of with a single word preceding "of", the short form will be Template:fooof; but if the long form is Template:Foobar bazbat of with multiple words preceding "of", the short form will be Template:foobaz. Furthermore I try to keep the shortcut no more than 6 or so characters, if possible.

In the table below I include all the non-language-specific form-of templates with aliases, and list all the canonical names (boldfaced) and the aliases, along with the corresponding canonical template, the number of uses and the suggested disposition.

In general I've followed the following principle when proposing to eliminate a template alias: If it has < 1000 uses, just delete it once it's orphaned; but if it has >= 1000 uses, deprecate it the way we've deprecated high-use templates like {{context}}. The threshold for deprecation is debatable, and maybe 1000 is too low. Of the five aliases below that are above the threshold, three ({{alt form of}} with 1136 uses, {{alternate spelling of}} with 1627 uses, {{plural form of}} with 1338 uses) are just above the threshold and could maybe be deleted without deprecation. The other two ({{alt form}} with 19464 uses and {{conjugation of}} with 24858 uses) are far above it and should be deprecated, not deleted; or alternatively, keep them as special cases since there are so many uses.
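The delete-or-deprecate rule just described can be written out as a tiny sketch. The function name `disposition` is made up, and the threshold of 1000 is the proposal's own debatable figure, so it is left as a parameter.

```python
DEPRECATION_THRESHOLD = 1000  # the proposal's figure, described there as debatable


def disposition(uses, threshold=DEPRECATION_THRESHOLD):
    """Suggest what to do with an alias once its uses have been renamed away."""
    return "deprecate" if uses >= threshold else "delete"


# Use counts taken from the proposal.
for alias, uses in [("alt form of", 1136), ("alternate form of", 444)]:
    print(alias, uses, disposition(uses))
```

Raising the threshold (for example `disposition(1136, threshold=2000)`) would turn the three borderline aliases into plain deletions, which is the alternative floated above.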

Aliased template	Canonical template	#Uses	Suggested disposition
Template:abbreviation of	Template:abbreviation of	6235	Keep
Template:abb	Template:abbreviation of	24	Rename to Template:abbr of then delete
Template:abbreviation	Template:abbreviation of	17	Rename to Template:abbr of then delete
Template:ao	Template:abbreviation of	114	Rename to Template:abbr of then delete
Template:alternative case form of	Template:alternative case form of	1378	Keep
Template:alternative capitalisation of	Template:alternative case form of	19	Rename to Template:alternative case form of then delete
Template:alternative capitalization of	Template:alternative case form of	73	Rename to Template:alternative case form of then delete
Template:altcaps	Template:alternative case form of	99	Rename to Template:alt case then delete
Template:altcase	Template:alternative case form of	7	Rename to Template:alt case then delete
Template:alternative form of	Template:alternative form of	80654	Keep
Template:alternate form of	Template:alternative form of	444	Rename to Template:alternative form of then delete
Template:alt form	Template:alternative form of	19464	Keep
Template:altform	Template:alternative form of	2355	Rename to Template:alt form then deprecate
Template:alt form of	Template:alternative form of	1136	Rename to Template:alt form then deprecate
Template:alt-form	Template:alternative form of	305	Rename to Template:alt form then delete
Template:alternative spelling of	Template:alternative spelling of	25283	Keep
Template:alternate spelling of	Template:alternative spelling of	1627	Rename to Template:alternative spelling of then deprecate
Template:altspelling	Template:alternative spelling of	158	Rename to Template:alt sp then delete
Template:altspell	Template:alternative spelling of	773	Rename to Template:alt sp then delete
Template:alt-sp	Template:alternative spelling of	415	Rename to Template:alt sp then delete
Template:alt spell of	Template:alternative spelling of	4	Rename to Template:alt sp then delete
Template:attributive form of	Template:attributive form of	707	Keep
Template:attributive of	Template:attributive form of	605	Rename to Template:attributive form of then delete
Template:clipping of	Template:clipping of	1061	Keep
Template:clipped form of	Template:clipping of	0	Delete
Template:clip	Template:clipping of	122	Rename to Template:clip of then delete
Template:comparative of	Template:comparative of	8054	Keep
Template:comparative form of	Template:comparative of	51	Rename to Template:comparative of then delete
Template:eclipsis of	Template:eclipsis of	773	Keep
Template:eclipsed	Template:eclipsis of	0	Delete
Template:eggcorn of	Template:eggcorn of	28	Keep
Template:eggcorn	Template:eggcorn of	7	Rename to Template:eggcorn of then delete
Template:ellipsis of	Template:ellipsis of	189	Keep
Template:anapodoton of	Template:ellipsis of	0	Delete
Template:ellipse of	Template:ellipsis of	1	Rename to Template:ellipsis of then delete
Template:feminine singular past participle of	Template:feminine singular past participle of	7294	Keep
Template:feminine past participle of	Template:feminine singular past participle of	828	Rename to Template:feminine singular past participle of then delete
Template:honorific alternative case form of	Template:honorific alternative case form of	18	Keep
Template:honoraltcaps	Template:honorific alternative case form of	18	Rename to Template:honor alt case then delete
Template:inflection of	Template:inflection of	1912109	Keep
Template:conjugation of	Template:inflection of	24858	Rename to Template:inflection of then deprecate
Template:initialism of	Template:initialism of	6761	Keep
Template:io	Template:initialism of	274	Rename to Template:init of then delete
Template:lenition of	Template:lenition of	1014	Keep
Template:lenited	Template:lenition of	0	Delete
Template:men's speech form of	Template:men's speech form of	3	Keep
Template:men's form of	Template:men's speech form of	1	Rename to Template:men's speech form of then delete
Template:misspelling of	Template:misspelling of	5782	Keep
Template:common misspelling of	Template:misspelling of	40	Rename to Template:misspelling of then delete
Template:misspell	Template:misspelling of	6	Rename to Template:missp then delete
Template:neuter singular of	Template:neuter singular of	1070	Keep
Template:neuter of	Template:neuter singular of	394	Rename to Template:neuter singular of then delete
Template:neuter singular past participle of	Template:neuter singular past participle of	356	Keep
Template:neuter past participle of	Template:neuter singular past participle of	0	Delete
Template:obsolete spelling of	Template:obsolete spelling of	6256	Keep
Template:obssp	Template:obsolete spelling of	2	Rename to Template:obs sp then delete
Template:obs-sp	Template:obsolete spelling of	9	Rename to Template:obs sp then delete
Template:passive of	Template:passive of	2492	Keep
Template:passive form of	Template:passive of	525	Rename to Template:passive of then delete
Template:passive past tense of	Template:passive past tense of	89	Keep
Template:past passive of	Template:passive past tense of	10	Rename to Template:passive past tense of then delete
Template:passive past of	Template:passive past tense of	4	Rename to Template:passive past tense of then delete
Template:past participle of	Template:past participle of	22547	Keep
Template:past participle	Template:past participle of	127	Rename to Template:past participle of then delete
Template:past tense of	Template:past tense of	874	Keep
Template:past of	Template:past tense of	56	Rename to Template:past tense of then delete
Template:plural definite of	Template:plural definite of	5103	Keep
Template:definite plural of	Template:plural definite of	30	Rename to Template:plural definite of then delete (or keep since it begins with a different letter?)
Template:plural indefinite of	Template:plural indefinite of	4793	Keep
Template:indefinite plural of	Template:plural indefinite of	197	Rename to Template:plural indefinite of then delete (or keep since it begins with a different letter?)
Template:plural of	Template:plural of	423977	Keep
Template:plural form of	Template:plural of	1338	Rename to Template:plural of then deprecate
Template:present tense of	Template:present tense of	2431	Keep
Template:present of	Template:present tense of	28	Rename to Template:present tense of then delete
Template:rare form of	Template:rare form of	350	Keep
Template:rareform	Template:rare form of	20	Rename to Template:rare form of then delete
Template:rare spelling of	Template:rare spelling of	584	Keep
Template:rarespell	Template:rare spelling of	82	Rename to Template:rare sp then delete
Template:rarspell	Template:rare spelling of	1	Rename to Template:rare sp then delete
Template:short for	Template:short for	1041	Keep
Template:short form of	Template:short for	133	Rename to Template:short for then delete
Template:short of	Template:short for	15	Rename to Template:short for then delete
Template:shortfor	Template:short for	2	Rename to Template:short for then delete
Template:singular definite of	Template:singular definite of	5240	Keep
Template:definite singular of	Template:singular definite of	598	Rename to Template:singular definite of then delete (or keep since it begins with a different letter?)
Template:standard spelling of	Template:standard spelling of	6	Keep
Template:standspell	Template:standard spelling of	6	Rename to Template:stand sp then delete
Template:substantivisation of	Template:substantivisation of	1	Replace with {{form of|...|substantivization}} then delete
Template:substantivization of	Template:substantivisation of	0	Delete
Template:superlative of	Template:superlative of	8126	Keep
Template:superlative form of	Template:superlative of	55	Rename to Template:superlative of then delete
Template:superseded spelling of	Template:superseded spelling of	787	Keep
Template:deprecated spelling of	Template:superseded spelling of	0	Delete
Template:superseded form of	Template:superseded spelling of	10	Rename to Template:superseded spelling of then delete
Template:synonym of	Template:synonym of	12661	Keep
Template:alternative term for	Template:synonym of	127	Rename to Template:synonym of then delete
Template:altname	Template:synonym of	164	Rename to Template:syn of then delete
Template:synonym	Template:synonym of	41	Rename to Template:synonym of then delete
Template:alternative name of	Template:synonym of	18	Rename to Template:synonym of then delete
Template:synof	Template:synonym of	598	Rename to Template:syn of then delete
Template:syn-of	Template:synonym of	1	Rename to Template:syn of then delete
Template:syn of	Template:synonym of	7	Keep

A few debatable issues:

Of the three templates I mention above with #uses a bit over 1000 ({{alt form of}} with 1136 uses, {{alternate spelling of}} with 1627 uses, {{plural form of}} with 1338 uses) that I'd like to eliminate, should we deprecate or delete them once they're orphaned?
Should we deprecate {{alt form}} (19464 uses) -> {{altform}}, or keep as a special case due to the high number of uses?
Similarly should we deprecate {{conjugation of}} (24858 uses) -> {{inflection of}}, or keep as a special case due to the high number of uses?
Should we orphan and delete {{definite plural of}} (30 uses) -> {{plural definite of}}, or keep because it begins with a different letter and might not be so easy to locate with autocompletion? Similarly for {{definite singular of}} and {{indefinite plural of}}?

Benwing2 (talk) 03:53, 29 March 2019 (UTC)[reply]

Wow, great job. Just a small comment: I feel that {{abbr of}}, {{syn of}}, etc., rather than {{abbrof}}, {{synof}}, etc., would be clearer. — SGconlaw (talk) 04:06, 29 March 2019 (UTC)[reply]

I agree with SGConlaw, this is a great effort, I can get on board with it, and I think spaces within the abbreviated forms would be preferable. - TheDaveRoss 13:05, 29 March 2019 (UTC)[reply]

I think {{conjugation of}} should go. It was once different from {{inflection of}}, but they've been aliases for years now.

Some of the other ones you listed are also pretty much special cases for {{inflection of}}, if categorization is not considered:

{{comparative of}} → {{inflection of|||comd}}
{{feminine singular past participle of}} → {{inflection of|||f|s|past|ptcp}}
{{neuter singular of}} → {{inflection of|||n|s}}
{{passive of}} → {{inflection of|||pasv}}
{{passive past tense of}} → {{inflection of|||pasv|past}}
{{past participle of}} → {{inflection of|||past|ptcp}}
{{past tense of}} → {{inflection of|||past}}
{{plural definite of}} → {{inflection of|||p|def}}
{{plural indefinite of}} → {{inflection of|||p|indef}}
{{plural of}} → {{inflection of|||p}}
{{present tense of}} → {{inflection of|||pres}}
{{singular definite of}} → {{inflection of|||s|def}}
{{superlative of}} → {{inflection of|||supd}}
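As a rough illustration of how mechanical such a replacement would be, here is a Python sketch that rewrites the simplest calls using the tag lists above. It is only a sketch under assumptions: it takes the language code as the first positional parameter (the `{{infl of|LANG|foo||n|s}}` shape given later in this thread), it only handles calls with exactly two plain positional parameters, and it ignores categorization entirely. The names `INFLECTION_TAGS` and `to_inflection_of` are made up.

```python
import re

# Tag lists copied from the correspondences listed above.
INFLECTION_TAGS = {
    "comparative of": ["comd"],
    "feminine singular past participle of": ["f", "s", "past", "ptcp"],
    "neuter singular of": ["n", "s"],
    "passive of": ["pasv"],
    "passive past tense of": ["pasv", "past"],
    "past participle of": ["past", "ptcp"],
    "past tense of": ["past"],
    "plural definite of": ["p", "def"],
    "plural indefinite of": ["p", "indef"],
    "plural of": ["p"],
    "present tense of": ["pres"],
    "singular definite of": ["s", "def"],
    "superlative of": ["supd"],
}

# {{name|lang|lemma}} with no named parameters and no nested templates.
SIMPLE_CALL = re.compile(r"\{\{([^{}|]+)\|([^{}|=]+)\|([^{}|=]+)\}\}")


def to_inflection_of(wikitext):
    """Rewrite simple specific form-of calls as {{inflection of}} calls."""
    def rewrite(match):
        name, lang, lemma = (group.strip() for group in match.groups())
        tags = INFLECTION_TAGS.get(name)
        if tags is None:
            return match.group(0)  # not one of the listed templates: leave as-is
        return "{{inflection of|%s|%s||%s}}" % (lang, lemma, "|".join(tags))
    return SIMPLE_CALL.sub(rewrite, wikitext)


print(to_inflection_of("# {{plural of|en|cat}}"))
```

Calls with extra parameters (alternative display text, glosses and so on) are left untouched by this sketch and would need separate handling.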

It's more maintainable if we have one template to specify inflections, rather than one general one and a bunch of specific cases that the general one also covers. The length or difficulty in typing it is pretty much moot because inflected forms are generally created through WT:ACCEL or with a bot anyway. It's only the ones that aren't inflections, but are added manually by editors, that need shortcuts. As for the categories, I'm of the opinion that inflection-of templates should never categorise, that is the task of {{head}}, and should be done according to the needs of the language. I remember when {{plural of}} once categorised, and it was awful. So even if we don't deprecate and replace them with {{inflection of}}, then we should still at least move towards removing the categories from them.

As for {{e-form of}}, I think it should be renamed. It's only used for Danish, so it should be {{da-e-form of}}. At the same time, it's really just another specialised inflection template, so it could also just be replaced with {{inflection of|||p|and|def|s|attr}}. —Rua (mew) 17:07, 29 March 2019 (UTC)[reply]

@Rua Overall I agree with you, and any issues with length could be made better by creating an alias {{infl of}} or {{inflof}}. But others may disagree and argue that the most common/basic inflectional form-of templates should stay. I also agree with renaming {{e-form of}}; I'll go ahead and do that if no one objects. What do you think however of the idea of making the renamings I present above? Should the short forms look like {{abbr of}} or {{abbrof}}? And for the ones that don't end in "of", should we have {{alt form}} or {{altform}}, and {{alt sp}} or {{altsp}}? Benwing2 (talk) 01:24, 30 March 2019 (UTC)[reply]

My preference is for the versions with spaces, that is, {{abbr of}} rather than {{abbrof}}. I think the former are more understandable. Otherwise, I have no objection to the proposed renamings. Thanks. — SGconlaw (talk) 01:28, 30 March 2019 (UTC)[reply]

I like spaces in the names too. — Eru·tuon 01:32, 30 March 2019 (UTC)[reply]

I am for removing all the names not containing “of” and prefer having a space before every “of”. Hence I think there should not be {{altcase}} but {{altcase of}}, and {{alt form}} and {{altform}} should not exist but {{altform of}}, so also {{altspell of}}, whereas for the parallelism the shortcut for {{obsolete spelling of}} should be {{obsspell of}}, right, and so {{misspell of}}, {{rarspell of}}, and I think I have said enough that you get my idea.

Furthermore I accede to the view that all the inflection-of templates should never categorize, and I assent to the said merges into {{inflection of}}. Fay Freak (talk) 02:24, 30 March 2019 (UTC)[reply]

@Fay Freak My only issue is that {{misspell of}} is kind of long for a shortcut. Benwing2 (talk) 03:15, 30 March 2019 (UTC)[reply]

@Erutuon, Sgconlaw So you'd prefer {{alt form}} over {{altform}}, {{rare sp}} over {{raresp}}, etc. Benwing2 (talk) 03:16, 30 March 2019 (UTC)[reply]

@Rua See Category:Form-of templates. At the top is a table of all the non-language-specific form-of templates that I know of, along with their properties, including which (if any) category the page is added to. Note that most of the inflection templates don't categorize; the primary exception is the participle templates. In this case, this is arguably correct; the headword is likely to just categorize into "LANG participles", and to get something more specific you might need the definition line. Benwing2 (talk) 04:35, 30 March 2019 (UTC)[reply]

@Rua I renamed {{e-form of}} to {{da-e-form of}} and made it assume |lang=da automatically. Benwing2 (talk) 05:04, 30 March 2019 (UTC)[reply]

@Rua You can see from Category:Form-of templates how utterly random the punctuation of the various templates is: Obsolete form of FOO. vs. Obsolete spelling of FOO vs. obsolete typography of FOO etc. Benwing2 (talk) 05:43, 30 March 2019 (UTC)[reply]

Yes. I'm more used to {{alt form of}}, but I'll get used to whatever is chosen. — Eru·tuon 19:03, 31 March 2019 (UTC)[reply]

Cleaning up lang-specific form-of template shortcuts

There are hundreds of lang-specific form-of templates. Most of them are really junky and many should be deleted. For the moment I'll just bring up one template that is misnamed and has several shortcuts:

Aliased template	Canonical template	#Uses	Suggested disposition
Template:kyūjitai spelling of	Template:kyūjitai spelling of	569	Rename to Template:ja-kyujitai spelling of then delete
Template:kyujitai spelling of	Template:kyūjitai spelling of	415	Rename to Template:ja-kyujitai spelling of then delete
Template:kyujitai of	Template:kyūjitai spelling of	66	Rename to Template:ja-kyusp (or Template:ja-kyu sp? or Template:ja-kyu? or Template:ja-kyu of?) then delete
Template:kyujitai	Template:kyūjitai spelling of	29	Rename to Template:ja-kyusp (or Template:ja-kyu sp? or Template:ja-kyu? or Template:ja-kyu of?) then delete
Template:kyu	Template:kyūjitai spelling of	41	Rename to Template:ja-kyusp (or Template:ja-kyu sp? or Template:ja-kyu? or Template:ja-kyu of?) then delete

Here, I'm operating under the following principles:

All language-specific templates should be prefixed by the language code.
Macrons and other diacritics should be avoided in template names.
The name of the short form is pending agreement on the general principles for naming these. Suggestions welcome.

Benwing2 (talk) 05:37, 30 March 2019 (UTC)[reply]

There is no principle to avoid macrons and diacritics in template names. The name is chosen according to what is the most readable or iconic, hence all diacritic distinctions, and macron-characters aren’t even hard to type – people should be gently pushed to use decent keyboard layouts. Particularly it does not unsettle Japanese editors who have to use IMEs anyway. It is all okay if there are redirects without the “special” characters. Fay Freak (talk) 16:43, 30 March 2019 (UTC)[reply]

Rename to Template:ja-kyūjitai spelling of / Template:ja-kyujitai spelling of / Template:ja-kyuujitai spelling of. —Suzukaze-c ◇◇ 21:30, 30 March 2019 (UTC)[reply]

While I have no problem typing macrons, it's not so easy for others and requiring editors to use particular keyboard layouts is way beyond Wiktionary's scope. So my preference is for forms without macrons. That said, I did create {{R:Álgu}} with an accent, so I suppose I'm not entirely consistent in that myself. Perhaps the macronless version could be a redirect. —Rua (mew) 21:41, 30 March 2019 (UTC)[reply]

Hence I said, the macronless versions can be redirects. The editors are not required then to use particular keyboard layouts. Whereas editors who are used to write the “correctly written” names are not detracted. The correctly written names have to have a certain acknowledgement. Of course Löw’s books are {{R:arc:Löw-Flora}} etc. “Low” and “Loew” aren’t his names, are other people’s names: So if I look into Cat:Aramaic reference templates I recognize the name “Löw”, while the same would not be true for “Loew”. These aren’t even redirects because diaereses are not macrons. Where we see “macrons” they generally have the tinge of something “optional”, for in the original intentions of the originators of the Latin alphabet, lengths in a language having the vowels a,e,i,o,u, as Japanese are not designated, while the same optionality is not there in the relation “ö” to “o”, perhaps also not there for Hungarian – well then I don’t know whether Hungarian templates have to have diacriticless redirects but at least I know that they are correctly added under the diacritic names. So we find the names with diacritics in Cat:Hungarian reference templates – it’s perfectly natural and it would affront people to say otherwise “macrons and other diacritics should be avoided in template names.” It’s not specific to reference templates, if somebody pulls that argument, other templates are considered the same way. “kyūjitai” is not “kyujitai”. Fay Freak (talk) 23:23, 30 March 2019 (UTC)[reply]

I would really prefer to avoid macrons when possible. Note for example we have Template:vi-Nom form of and Template:han tu form of, not Template:vi-Nôm form of and Template:hán tự form of. There is no ambiguity when "kyujitai" is written instead of "kyūjitai". There's also a difference between transcriptions, which are always approximate and variable, and terms written in the original script, which is the case with Löw and Álgu. I also think there's a difference between R: templates with people's names, and templates with foreign terms in them. Benwing2 (talk) 01:01, 31 March 2019 (UTC)[reply]

Outcome of discussion?

@Erutuon, Fay Freak, Sgconlaw, TheDaveRoss All of you expressed a preference for having a space before "of" in the short form, and no one expressed the opposite preference, hence I will implement {{abbr of}}, {{init of}}, {{clip of}}, {{syn of}}, etc. instead of {{abbrof}}, {{initof}}, {{clipof}}, {{synof}}, etc. What's a little less clear to me is short forms of the multi-word templates. Logically, I think you guys would prefer {{alt form}}, {{alt case}}, {{alt sp}}, instead of {{altform}}, {{altcase}}, {{altsp}}, is that right? {{missp}} will have to remain as such because "misspelling" is a single word, and I think {{rare form of}} will remain without a shorter form, because {{rare form}} is hardly much shorter. OK? Benwing2 (talk) 20:18, 31 March 2019 (UTC)[reply]

I agree with what you said here, spaces in all of the shortened multi-word templates. If nothing else it will make it simple to recall whether or not there is a space in a particular template, because the answer will be yes. - TheDaveRoss 12:03, 1 April 2019 (UTC)[reply]

I am in favour of separating "of" from the rest with a space, but I have no particular opinion on spaces in the rest of the name. —Rua (mew) 12:54, 1 April 2019 (UTC)[reply]

The only thing I'll add at this time is that IMO we should keep a redirect at {{altform}} because it's cheap and harmless and what some people (me) are used to typing... I don't care if you want to periodically change occurrences of it to {{alt form}} (although, at that point, why not switch it to the full name?). - -sche (discuss) 15:30, 2 April 2019 (UTC)[reply]

Of should definitely be a separate word, but I'm unsure about whether {{altform}} or {{alt form}} is better. I guess I don't mind either form, perhaps with a slight preference for the spaced form. — SGconlaw (talk) 16:14, 2 April 2019 (UTC)[reply]

I do use {{neuter of}} which gives the wording "neuter singular of". I would like it kept. DonnanZ (talk) 16:31, 2 April 2019 (UTC)[reply]

I disagree. It shouldn't have a particular name and then display something else. —Rua (mew) 20:59, 2 April 2019 (UTC)[reply]

I seem to remember your dislike of shortcuts. DonnanZ (talk) 21:09, 2 April 2019 (UTC)[reply]

Ok? Did this somehow become personal? —Rua (mew) 21:11, 2 April 2019 (UTC)[reply]

I agree with Rua. - TheDaveRoss 22:38, 2 April 2019 (UTC)[reply]

Hmm, being forced to use more keystrokes is a disimprovement. DonnanZ (talk) 23:38, 2 April 2019 (UTC)[reply]

@Donnanz {{neuter of}} is the wrong name for a shortcut for this because "neuter of X" is not the same as "neuter singular of X". I have created {{infl of}} as a shortcut of {{inflection of}}; you can replace {{neuter singular of|LANG|foo}} with {{infl of|LANG|foo||n|s}} which is shorter, or if you want I'll create something like {{neut sing of}}. Benwing2 (talk) 02:52, 3 April 2019 (UTC)[reply]

I welcome {{infl of}} which I actually wanted. Swings and roundabouts. DonnanZ (talk) 09:14, 3 April 2019 (UTC)[reply]

OK, I finished the renames and deletions/deprecations. @-sche After renaming uses of {{altform}} I left it as-is for now rather than deprecating it, per your request, although it isn't documented. Benwing2 (talk) 03:18, 3 April 2019 (UTC)[reply]

So just call it {{nsing of}} or something along those lines. Chuck Entz (talk) 03:21, 3 April 2019 (UTC)[reply]

How to properly append/insert a new entry in a new language to an existing page using pywikibot

I am using pywikibot with -appendbottom currently and providing '----\n' which, however, renders as a string. This is the way suggested in the documentation for pywikibot's pagefromfile script. I've tried a couple of variations but nothing works properly. Is there a way to insert the new entry in the proper alphabetical order? If not, how does one append the new entry with proper spacing to the end? Sinonquoi (talk) 12:40, 29 March 2019 (UTC)[reply]

I haven't used pywikibot in a long time, but I don't think pagefromfile has this functionality out of the box. Are you willing to do some coding? The method isn't complicated. - TheDaveRoss 13:02, 29 March 2019 (UTC)[reply]

Although I am not a coder primarily, if someone gives me an idea of what to do, I would be able to do it. Sinonquoi (talk) 17:03, 29 March 2019 (UTC)[reply]

It's not really necessary. If you just add the language section to the end of the page it will be cleaned up eventually to the correct format when I run my language section cleanup script. The formatting within the language section is much more important since that won't be touched. DTLHS (talk) 17:06, 29 March 2019 (UTC)[reply]

Appending to the bottom of the page is not the way you want to do it, unless you happen to be working on the language that comes absolutely last in sorting order. And even then, it's possible that the language section is already there and you end up with two sections for the same language. Instead, you have to determine if the language isn't already on the page, and if not, find the appropriate place to insert the section. This is best done by parsing the wikitext, using a tool such as MWParserFromHell. Once parsed, you have the contents of the page in the form of well-structured Python objects rather than as a string, which is much easier to work with and makes it a lot easier to edit pages. —Rua (mew) 17:12, 29 March 2019 (UTC)[reply]

You can probably do it without the full parser package, just grab the whole wikitext, split by L2 section, check if yours exists, if not sort the sections and rebuild the page. @Sinonquoi if you are on IRC feel free to hop in #wiktionary if you want to talk about it. - TheDaveRoss 17:17, 29 March 2019 (UTC)[reply]
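A minimal sketch of the first two steps (split by L2 section, check whether yours exists), using only the standard library rather than a full parser. The names `split_l2` and `has_language` are made up, and the pattern assumes L2 headers are written as `==Language==` on a line of their own; sorting and rebuilding the page is left out.

```python
import re

# An L2 header: exactly two equals signs on each side, alone on its line.
L2_HEADER = re.compile(r"^==([^=\n]+)==[ \t]*$", re.MULTILINE)


def split_l2(wikitext):
    """Return (preamble, [(language, section_text), ...]) for a page's wikitext."""
    matches = list(L2_HEADER.finditer(wikitext))
    if not matches:
        return wikitext, []
    preamble = wikitext[:matches[0].start()]
    sections = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        sections.append((match.group(1).strip(), wikitext[match.start():end]))
    return preamble, sections


def has_language(wikitext, language):
    """True if the page already has an L2 section for `language`."""
    return any(name == language for name, _ in split_l2(wikitext)[1])


page = "==English==\n\n===Noun===\n# x\n\n----\n\n==Spanish==\n\n===Noun===\n# y\n"
print([name for name, _ in split_l2(page)[1]])
```

Each section's text here still includes any trailing `----` separator, so a rebuild step would have to normalize those before re-joining.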

You can't, it's not that simple and it's much less likely to break something if you just append to the end. DTLHS (talk) 17:22, 29 March 2019 (UTC)[reply]

But that leads to broken entries until someone else fixes them, which is not allowed. —Rua (mew) 17:26, 29 March 2019 (UTC)[reply]

We really need to develop a standard bot framework so everyone doesn't have to reinvent the wheel. I don't see them as "broken", just unsorted. This is how many bots (such as SemperBlottoBot) have operated for many years. DTLHS (talk) 17:28, 29 March 2019 (UTC)

What languages are you all using? - TheDaveRoss 17:55, 29 March 2019 (UTC)[reply]

Um, English? —Rua (mew) 18:01, 29 March 2019 (UTC)[reply]

(I think he meant programming language :p ) Python, but I'd be willing to work in anything if we were collaborating and agreed on something else. Python is what the majority of existing wiki editing software is written in however. DTLHS (talk) 19:17, 29 March 2019 (UTC)[reply]

I thought he might mean that, but this is a dictionary... —Rua (mew) 19:53, 29 March 2019 (UTC)[reply]

Maybe it would work to go through each language header and add the new language section immediately before the first header that is greater than the language header you are adding, ignoring the Translingual and English headers (which should be at the top), otherwise at the end. The new language section should have "\n\n----\n\n" between it and any neighboring language sections. (This method assumes the language headers, except for Translingual and English, are in Unicode code point order.) — Eru·tuon 19:15, 29 March 2019 (UTC)[reply]
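The method proposed above could be sketched roughly as follows. The function name is hypothetical, the duplicate check is an added safeguard that is not part of the proposal, and the sketch assumes the language being added is neither Translingual nor English. Python compares strings by code point, which matches the ordering assumption stated in the proposal.

```python
import re

# Level-2 header: exactly two '=' on each side, e.g. "==English=="
L2_HEADER = re.compile(r"^==([^=\n]+)==[ \t]*$", re.MULTILINE)
SEPARATOR = "\n\n----\n\n"
TOP_LANGUAGES = {"Translingual", "English"}  # kept at the top, never compared


def insert_language_section(wikitext, language, new_section):
    """Insert new_section before the first L2 header that sorts after it."""
    new_section = new_section.strip()
    if not wikitext.strip():
        return new_section + "\n"
    names = [m.group(1).strip() for m in L2_HEADER.finditer(wikitext)]
    if language in names:
        return wikitext  # added safeguard: do not create a second section
    for match in L2_HEADER.finditer(wikitext):
        name = match.group(1).strip()
        if name in TOP_LANGUAGES:
            continue
        if name > language:  # code point comparison
            start = match.start()
            return wikitext[:start] + new_section + SEPARATOR + wikitext[start:]
    # No later header found: append at the end of the page.
    return wikitext.rstrip() + SEPARATOR + new_section + "\n"
```

Because only one slice of the page is touched, the resulting diff stays small, but anything that follows the last language section ends up above a section appended at the end.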

You need to take into account things that may be at the start or end of the page such as interwikis (yes we still have interwikis), hence, "not that simple". DTLHS (talk) 19:19, 29 March 2019 (UTC)[reply]

Ahh, I see what you're getting at. So it would be useful to have an "add new language section" function that deals with all of this. It could also alphabetize sections as TheDaveRoss proposes, though that would sometimes create a messier diff. — Eru·tuon 19:41, 29 March 2019 (UTC)[reply]

Add it at the end, then append '\n{{rfc-auto}}\n' - this will tell another bot to put it in the right place. SemperBlotto (talk) 09:18, 3 April 2019 (UTC)[reply]

I don't know that there are any bots which are looking for that template any longer, unless @Rua or @DTLHS are, or are willing to include it in their bots. DTLHS will reorder the language sections as part of the cleanup process, but I don't know what criteria are used for choosing which entries to work on. - TheDaveRoss 18:33, 3 April 2019 (UTC)[reply]

I process all entries, there's no selection process other than comparing them before and after the reordering to see if something changed. rfc-auto is removed but it's not used at all to target entries. DTLHS (talk) 05:27, 4 April 2019 (UTC)[reply]

Searches and translations

When a search is conducted on a non-existent term in the search box at the top right corner of each page and the term appears in a translation table, the search results page shows the line "[non-existent term] is a [language] translation of the word [entry] ("[brief definition of entry]")". I suppose this information is retrieved from {{trans-top}} on the entry page. However, it seems that this doesn't work well when {{trans-top-also}} is used, because this is what results: "gîrokirî is a Kurdish translation of the word abeyant ("also|being in a state of abeyance|suspended")". Hopefully someone can look into this. — SGconlaw (talk) 19:46, 29 March 2019 (UTC)[reply]

Some Can Not Play Mandarin Chinese Audio Files linked in zh-pron for Months (in certain browsers)

Tracked in Phabricator
Task T130982

"|ma=Zh-zhēnzhū.oga" As used in zh-pron on the 珍珠 page should allow readers a chance to listen to the pronunciation "zhēnzhū".

Mandarin
(Standard)
(Pinyin): zhēnzhū

(Zhuyin): ㄓㄣㄓㄨ

(Dungan, Cyrillic and Wiktionary): җынҗў (žɨnžw, I-I)
Cantonese (Jyutping): zan¹ zyu¹
Hakka (Sixian, PFS): chṳ̂n-chû
Eastern Min (BUC): dĭng-ciŏ / cĭng-ciŏ
Southern Min (Hokkien, POJ): chin-chu / tin-chu
Wu (Wugniu)
- (Northern): ¹tsen-tsy

Mandarin
- (Standard Chinese)⁺
  - Hanyu Pinyin: zhēnzhū
  - Zhuyin: ㄓㄣㄓㄨ
  - Tongyong Pinyin: jhenjhu
  - Wade–Giles: chên¹-chu¹
  - Yale: jēn-jū
  - Gwoyeu Romatzyh: jenju
  - Palladius: чжэньчжу (čžɛnʹčžu)
  - Sinological IPA ^(key): /ʈ͡ʂən⁵⁵ ʈ͡ʂu⁵⁵/
- (Dungan)
  - Cyrillic and Wiktionary: җынҗў (žɨnžw, I-I)
  - Sinological IPA ^(key): /ʈ͡ʂəŋ²⁴ ʈ͡ʂu²⁴/
  (Note: Dungan pronunciation is currently experimental and may be inaccurate.)
Cantonese
- (Standard Cantonese, Guangzhou–Hong Kong)⁺
  - Jyutping: zan¹ zyu¹
  - Yale: jān jyū
  - Cantonese Pinyin: dzan¹ dzy¹
  - Guangdong Romanization: zen¹ ju¹
  - Sinological IPA ^(key): /t͡sɐn⁵⁵ t͡syː⁵⁵/
Hakka
- (Sixian, incl. Miaoli and Meinong)
  - Pha̍k-fa-sṳ: chṳ̂n-chû
  - Hakka Romanization System: ziin´ zu´
  - Hagfa Pinyim: zin¹ zu¹
  - Sinological IPA: /t͡sɨn²⁴⁻¹¹ t͡su²⁴/
Eastern Min
- (Fuzhou)
  - Bàng-uâ-cê: dĭng-ciŏ / cĭng-ciŏ
  - Sinological IPA ^(key): /tiŋ⁵⁵ ^(t͡s-)ʒuo⁵⁵/, /t͡siŋ⁵⁵ ^(t͡s-)ʒuo⁵⁵/
Southern Min
- (Hokkien: Xiamen, Quanzhou, Zhangzhou, General Taiwanese)
  - Pe̍h-ōe-jī: chin-chu
  - Tâi-lô: tsin-tsu
  - Phofsit Daibuun: cinzw
  - IPA (Xiamen, Zhangzhou): /t͡sin⁴⁴⁻²² t͡su⁴⁴/
  - IPA (Quanzhou): /t͡sin³³ t͡su³³/
  - IPA (Taipei, Kaohsiung): /t͡sin⁴⁴⁻³³ t͡su⁴⁴/
- (Hokkien: Xiamen, Quanzhou, Zhangzhou, variant in Taiwan)
  - Pe̍h-ōe-jī: tin-chu
  - Tâi-lô: tin-tsu
  - Phofsit Daibuun: dinzw
  - IPA (Quanzhou): /tin³³ t͡su³³/
  - IPA (Xiamen, Zhangzhou): /tin⁴⁴⁻²² t͡su⁴⁴/
  - IPA (Taipei, Kaohsiung): /tin⁴⁴⁻³³ t͡su⁴⁴/
Wu
- (Shanghai):
  - Wugniu: ¹tsen-tsy
  - MiniDict: tsen^平 tsy
  - Wiktionary Romanisation (Shanghai): ¹tsen-tsr
  - Sinological IPA (Shanghai): /t͡sən⁵⁵ t͡sz̩²¹/

The above zh-pron doesn't allow me (and User:Tooironic and others?) to play the audio file. I tried it on my laptop using Opera and Firefox in Safe Mode. In my case, where there should be a play button etc, there's just a pale grey bar with nothing on it. This has been going on for months at least- I noticed this in December, but I thought it was just something wrong with my computer, because I can play the file on my phone using the generic browser.

Disaster.

There's nothing wrong with Wikimedia's Zh-zhēnzhū.oga file- it can be played perfectly on the zhēnzhū page when used in Template:audio in Opera.

--Geographyinitiative (talk) 10:29, 30 March 2019 (UTC)[reply]

This has been an issue for more than six months, and I have found it very frustrating. @Justinrleung, suzukaze-c know about it, but I guess they don't know what to do? —Μετάknowledge^{discuss/deeds} 20:00, 30 March 2019 (UTC)[reply]

It used to cover the IPA. A few days after I fixed that (using a hack I figured out in the past), the damn buttons stopped showing up. I've said this before, and I will say it again: I hated the thing since the first time I ever spotted it on Wikipedia. Old revisions of the relevant modules from before I added my hack don't work either. —Suzukaze-c ◇◇ 21:16, 30 March 2019 (UTC)

@Geographyinitiative, Metaknowledge, Suzukaze-c: If I recall correctly, the problem has to do with its interactions with the collapsible element. What if we display the audio in the uncollapsed part, i.e. after the romanization in the main part instead of after the IPA in the collapsed part? — justin(r)leung _{{ (t...) | c=› }} 00:31, 31 March 2019 (UTC)[reply]

I love this idea. No need to hide our lamp under a bushel. This way, people who are visiting Wiktionary for the first time and might want to hear the pronunciation can actually SEE that we provide a playable audio file without having to click into and look through all the complicated academic stuff in zh-pron. --Geographyinitiative (talk) 01:05, 31 March 2019 (UTC)[reply]

In the Developer Tools under 'Console' in the web browser, Firefox shows `PlayerControlBuilder:: Not enough space for control component` five times on https://en.wiktionary.org/wiki/珍珠?debug=true . So the audio player is rendered, but within the audio player area there is not enough space to render `pause`, `volumeControl`, `timeDisplay`, `options` and `timedText`. Feel free to follow mw:How to report a bug and file a ticket in Phabricator under the "TimedMediaHandler-Player" project tag to make developers aware of this. Thanks! --AKlapper (WMF) (talk) 15:28, 5 April 2019 (UTC)

@Justinrleung, suzukaze-c: In the mean time, can someone move the audio as Justin suggested? —Μετάknowledge^{discuss/deeds} 16:56, 8 April 2019 (UTC)[reply]
@Metaknowledge, Suzukaze-c, Geographyinitiative: Done, but I'm not sure if this is how we want it to look. Any suggestions on how it should display would be much appreciated. — justin(r)leung _{{ (t...) | c=› }} 21:21, 8 April 2019 (UTC)[reply]

I love this new look. We have just made the dictionary way more friendly to Chinese as a second language studiers. If you have to change it back, I understand. But I am loving it. --Geographyinitiative (talk) 23:57, 8 April 2019 (UTC)[reply]

With the new change, I just listened to four Chinese dialect pronunciations for Taiwan within a matter of about ten seconds. https://en.wiktionary.org/wiki/臺灣 I had NEVER done that before. --Geographyinitiative (talk) 01:37, 9 April 2019 (UTC)[reply]

God this is awesome. Please don't change it back. It's so convenient. --Geographyinitiative (talk) 01:10, 17 April 2019 (UTC)

@Geographyinitiative, Justinrleung: It seems like the new video player can work with collapsed elements now (preferences). (Perhaps we shouldn't re-hide the audio though.) —Suzukaze-c ◇◇ 07:09, 24 June 2019 (UTC)[reply]

I like having it out there right in your face so that new readers can see that we have that kind of functionality. If it needs to be put back though, I'm not going to oppose it too much. I wish some kind of statistics could be generated about use of the audio files and page views. I would believe that the audio files attract readers and keep them coming, but idk. In the past, I would often click on the audio files in Pleco, but idk if other people do it. --Geographyinitiative (talk) 09:35, 24 June 2019 (UTC)[reply]

Finding the lemma in Module:la-noun and Module:la-adj

@JohnC5, Erutuon I added support for WT:ACCEL to Module:la-noun and Module:la-adj, but I'm having trouble with one detail. Latin uses diacritics that don't appear in the page name, and those need to be specified in the lemma = value in the module, which is currently set to nil. Because I don't know much about how the module works, I'm not sure where to get the lemma from. If I hardcode it as the nominative singular, then of course it won't work for plurale tantum nouns. But there may also be nouns out there that have no nominative forms at all, so I'd rather not make assumptions that silently break things. —Rua (mew) 13:48, 30 March 2019 (UTC)

There are also some Greek words whose lemma isn't the nominative singular, like ἐμαυτοῦ (emautoû), σεαυτοῦ (seautoû), ἑαυτοῦ (heautoû). My idea was to look at the (masculine) nominative singular, then if that doesn't exist, the (masculine) nominative plural, then the (masculine) genitive plural, then the (masculine) genitive singular, and maybe go through more of the case-forms in a certain order and use the first form that exists. I don't think that logic is quite right, but it's a starting point. — Eru·tuon 18:05, 30 March 2019 (UTC)[reply]
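The fallback idea above can be sketched as a "first form that exists" lookup. This is written in Python for illustration only (the modules themselves are Lua), and the slot names such as `nom_sg` are hypothetical; the real modules may label their forms differently. For adjectives, the same lookup would run over the masculine forms.

```python
# Hypothetical slot names, in the fallback order suggested in the thread.
FALLBACK_ORDER = ["nom_sg", "nom_pl", "gen_pl", "gen_sg"]


def find_lemma(forms, order=FALLBACK_ORDER):
    """Return the first existing form in the fallback order, or None.

    forms maps a slot name to a list of forms (with diacritics).
    """
    for slot in order:
        values = forms.get(slot)
        if values:  # skip missing slots and empty lists
            return values[0]
    return None  # no candidate: let the caller decide, rather than guess


# A plurale tantum noun has no singular forms, so the nominative plural wins.
arma = {"nom_pl": ["arma"], "gen_pl": ["armōrum"]}
print(find_lemma(arma))
```

Returning `None` when nothing matches leaves the decision to the caller instead of silently picking a wrong lemma.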

Looking for templates that still accept only `|lang=` for the language code, not `|1=`

I have been systematically converting all templates that take a language code to accept |1= as well as |lang=. I have been tracking the templates and the language-code params they accept on this page: Wiktionary:Templates with current language parameter. There are only a few left that I know of that don't accept the language parameter in |1=; mostly this is because the |lang= parameter is optional, and before we allow |1= to contain a language code, we have to make the language code mandatory and do bot runs to add the language code everywhere it's missing. Please feel free to update that page with any other templates that are missing, or ping me directly about converting a template. Benwing2 (talk) 06:56, 31 March 2019 (UTC)[reply]

Redirecting `{{der2}}`/`{{der3}}`/`{{der4}}`/..., `{{rel2}}`/`{{rel3}}`/`{{rel4}}`/..., `{{syn2}}`/`{{syn3}}`/`{{syn4}}`/... etc. to `{{col2}}`/`{{col3}}`/`{{col4}}`/...

Currently we have a whole slew of similar templates to create multicolumn output:

{{der1}}, {{der2}}, {{der3}}, {{der4}}, {{der5}}
{{der2-u}}, {{der3-u}}, {{der4-u}}, {{der5-u}}
{{rel2}}, {{rel3}}, {{rel4}}, {{rel5}}
{{syn2}}, {{syn3}}, {{syn4}}
{{ant2}}, {{ant3}}, {{ant4}}
{{hyp2}}, {{hyp3}}, {{hyp4}}
{{hyp4-u}}
{{coord2}}
{{desc3}}

These used to differ in the default title displayed, but per the outcome of Wiktionary:Beer parlour/2018/November#Titles of morphological relations templates, the title is no longer displayed, and as a result {{der2}}/{{rel2}}/{{syn2}}/{{ant2}}/{{hyp2}}/{{coord2}} all do *exactly* the same thing.

I would like to redirect all of these to a single set of columnar templates:

{{col1}}, {{col2}}, {{col3}}, {{col4}}, {{col5}} for the variants that auto-sort by default
{{col1-u}}, {{col2-u}}, {{col3-u}}, {{col4-u}}, {{col5-u}} for the variants that do not auto-sort by default

Note that all of these templates accept a |sort= parameter that can be used to override the auto-sorting behavior.
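As an illustration of the proposed mapping, a bot could compute the redirect target from a legacy template name like this. The function name and the regular expression are assumptions; the pattern is deliberately broader than the list above, so it also accepts combinations that do not exist as templates (for example `coord5-u`).

```python
import re

# Legacy prefix, column count 1-5, optional "-u" (unsorted) suffix.
LEGACY = re.compile(r"^(?:der|rel|syn|ant|hyp|coord|desc)([1-5])(-u)?$")


def col_target(template_name):
    """Map e.g. 'der3' to 'col3' and 'hyp4-u' to 'col4-u'; None otherwise."""
    m = LEGACY.match(template_name)
    if m is None:
        return None
    return "col" + m.group(1) + (m.group(2) or "")


for name in ["der3", "hyp4-u", "coord2", "col2"]:
    print(name, "->", col_target(name))
```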

Also, the underlying code in Module:columns accepts an invocation parameter |class= to specify a CSS class, which could be used to add classes "derived-terms", "related-terms", etc. to the HTML output, allowing users to customize the behavior e.g. of derived vs. related terms differently. Currently this feature is unused, but if we used it, we wouldn't want to do all the redirections. However, it's not at all clear to me it makes sense to add this capability ... these are fundamentally similar lists of terms and it seems of questionable utility to make it possible to display them differently, esp. considering the complexity it adds vs. having a single set of columnar templates.

Thoughts?

Benwing2 (talk) 19:57, 31 March 2019 (UTC)[reply]

I was going to suggest this, so Support. However, I also think the sorting behaviour should be removed altogether. It needlessly wastes processing time whenever a page is loaded, whereas sorting the terms manually is something an editor can do once and then nobody has to worry about it again. If people really want a way to sort terms, then it can be done with a substable template that receives a language and any number of unnamed parameters, and then is substed into the same parameters but sorted, separated by pipes so that they fit syntactically into the surrounding template. Thus, when you save a page containing this:

{{col2|en|{{subst:sort|en|foo|bar|baz}}}}

it automatically expands into this:

{{col2|en|bar|baz|foo}}

This way, there is no need for sorting variants of the templates, yet editors have the option to sort anytime they add new items to the list. —Rua (mew) 21:15, 31 March 2019 (UTC)[reply]
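What the substituted template would have to produce can be sketched as below. This is a Python illustration of the idea, not the Lua behind any actual template; the function names are hypothetical, and plain case-folded code point sorting stands in for real language-specific collation, so the `lang` argument is accepted but unused.

```python
def sort_args(lang, *terms):
    """Return the terms sorted and pipe-joined, ready to sit inside a template call.

    Assumption: case-folded code point order; lang is ignored in this sketch.
    """
    return "|".join(sorted(terms, key=lambda t: (t.casefold(), t)))


def expand(lang, n_cols, *terms):
    """Simulate saving {{colN|lang|{{subst:sort|lang|...}}}}."""
    return "{{col%d|%s|%s}}" % (n_cols, lang, sort_args(lang, *terms))


print(expand("en", 2, "foo", "bar", "baz"))  # {{col2|en|bar|baz|foo}}
```

Since the sorting happens once at save time, nothing needs to be sorted again when the page is rendered.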

@Rua: I created a substable sorter based on the code formerly in Module:columns a while ago: see Module:collation. — Eru·tuon 22:02, 31 March 2019 (UTC)

That's a good idea, but it needs to be more practical to use. The syntax {{sort|lang|term1|term2|term3}} is easier. —Rua (mew) 22:49, 31 March 2019 (UTC)[reply]

Yeah, not very ergonomic, but I didn't think it would be used very often. [Edit: Created {{sort}}.] — Eru·tuon 03:54, 1 April 2019 (UTC)

Also, another point: the number of columns should be a parameter, so we don't need different templates for different numbers of columns. —Rua (mew) 23:06, 31 March 2019 (UTC)[reply]

Support, and I agree with Rua. Chignon – Пу чок 23:07, 31 March 2019 (UTC)[reply]

Redirected. I also created a generic template {{col}} that takes the number of columns as a parameter, after the language code and before the terms. The templates {{col}} and {{col1}} .. {{col5}} auto-sort; {{col-u}} and {{col1-u}} .. {{col5-u}} don't. I really don't think the time to sort is worth worrying about; sorting is O(N log N), which is effectively linear for normal-length tables, and is not where most of the time goes. Benwing2 (talk) 01:44, 1 April 2019 (UTC)[reply]

Well I think we're more concerned about memory usage than runtime in this case. DTLHS (talk) 01:47, 1 April 2019 (UTC)[reply]

Support. — SGconlaw (talk) 03:21, 1 April 2019 (UTC)[reply]

@Benwing2: Referring to this thread, I wanted something like {{col1-u}}, so I have experimented with it at export. I'm not sure that I like the result, and I may need to refine it. I would have preferred it to be completely collapsible. DonnanZ (talk) 09:11, 2 April 2019 (UTC)[reply]

Oppose making it completely collapse.

Support showing a number of lines in the collapsed state. Wrapping a list of terms in one of these templates should never result in hiding all the terms, that is a regression in the usefulness of the template because the user now has to expand the list before seeing any terms. Instead, the purpose is to hide additional terms once there are too many. —Rua (mew) 11:42, 4 April 2019 (UTC)[reply]

Showing a number of lines decreases its usefulness; in some cases it may not be worth using it if little space is saved in the semi-collapsed state. The present semi-collapsed set-up with these templates is quite stupid and an eyesore. DonnanZ (talk) 22:42, 14 April 2019 (UTC)[reply]