Wiktionary:Grease pit/2024/May


Lithuanian collation

I've been cleaning up some coding errors in Lithuanian entries, and I noticed that seven years ago @エリック・キィ had ineffectually added |sort=skesti to the headword line of Lithuanian skę̃sti. What problem was it hoped to address? The word currently gets sorted (except when separated from its tagging as Lithuanian) between "ske" and "skė", and in particular before "ski"; it looks as though sorting was much worse when the entry was created. It's possible that the hope was that it would be sorted between Lithuanian skėrys and Lithuanian skėtis, as would be done by Mimer SQL. (Notifying Agamemenon, Apisite, BigDom, GabeMoore, Insaneguy1083, Helrasincke, Hippietrail, RichardW57, Sławobóg, 70.175.192.217.) I have removed the ineffectual parameter from the page. RichardW57m (talk) 13:13, 1 May 2024 (UTC)

And we can find successive dictionary entries Lithuanian skersvėjis, Lithuanian skėsti, skęsti, sketera,[1] showing the problems with promoting secondary collating differences to primary differences. --RichardW57m (talk) 14:50, 1 May 2024 (UTC)Reply
Where is the design rationale for our current Lithuanian collation? As far as I can tell, if it was controlled from Module:languages/data/2, it was implemented on or after 2 December 2022, in Module:lt-sortkey, which was deleted on 7 January 2023. Essentially: why are secondary collation differences promoted to primary, whereas in French they are simply ditched, so that we sort French e, é, è and ê the same but make Lithuanian e, ę and ė sort like completely different letters? Was it a conscious decision? I suspect the decision was taken by @Theknightwho, and it can always be justified by being better than what went before. --RichardW57m (talk) 17:25, 1 May 2024 (UTC)
@RichardW57m What do you mean by "problems with promoting secondary collating differences to primary differences"? Can you clarify? BTW I doubt User:Theknightwho intentionally made a decision to change Lithuanian sorting order. He did a lot of work restructuring the *handling* of sort keys, but AFAIK the intent was to preserve whatever sorting rules were already present. Benwing2 (talk) 23:58, 1 May 2024 (UTC)Reply
@Benwing2 Richard is referring to the Unicode Collation Algorithm, which uses primary, secondary and tertiary weightings (secondary tiebreaks primary, and so on). While I'd very much like us to use the UCA, implementing it would be a lot of work, but it would be a big improvement over the crude sort methods we generally use at the moment.
To answer @RichardW57m's question: the current sortkey isn't based on the UCA - with a handful of exceptions, our sortkey algorithms are very simple. Theknightwho (talk) 00:23, 2 May 2024 (UTC)Reply
@Benwing2, Theknightwho: Well, someone made a change between 28 November 2022 and 15 December 2022 (see [2]), but the history is hidden in the now deleted Module:lt-sortkey. Was it perhaps @Octahedron80? --RichardW57 (talk) 06:48, 2 May 2024 (UTC)Reply
I'm afraid Theknightwho's answer isn't an answer to my question. While our collations seem to be defined entirely by what the UCA would term primary keys, they tend to attempt to approximate the native sorting orders. It hasn't always been done - our Roman script Pali sorting bears no relationship to the usual sort order, for example, being mostly based on the foremost sorting order for each script. Someone should be making a decision on how to do the approximation - but they may not actually understand the subtlety of the secondary level. --RichardW57 (talk) 06:48, 2 May 2024 (UTC)Reply
@RichardW57 There's only one edit to Module:lt-sortkey, when it was created on Dec 1 2022 by User:Theknightwho. It looks like prior to that there was no sorting algorithm defined for Lithuanian. The Lithuanian dictionary at [3] sorts ę as if it were e (e.g. gesti is directly followed by gęsti) but treats ė as a distinct letter, hence gežtis is directly followed by gėbelėti. I think you should figure out the correct sort order according to standard dictionaries, and then we can implement it. Benwing2 (talk) 07:07, 2 May 2024 (UTC)Reply
{{re|Benwing2}} There had been a prior algorithm, but it was subsumed in the stripping of the three stress accents to generate page names. I haven't dug into the history of this stripping, but it may explain some of the oddities of Lithuanian templates if it was a later feature on Wiktionary.
Am I missing a trick with the LKZ web site? I can't see how to get a dictionary page from it, only dictionary entries. Short of ordering books, I could only find Lalis's dictionary and introductory pages. --RichardW57m (talk) 08:40, 2 May 2024 (UTC)Reply
@RichardW57m You can e.g. type just g in the search box and hit enter, and down the left rail you'll see a list of all the entries starting with g, in sorted order. Benwing2 (talk) 08:56, 2 May 2024 (UTC)Reply
@Benwing2: Thanks. Curious. The 'standard' Lithuanian collation as defined by CLDR and demonstrated (today) at [4] has gėbelėti before gežtis. --RichardW57m (talk) 10:19, 2 May 2024 (UTC)Reply
@Benwing2: But remember Theknightwho's assessment above of the UCA (let alone the CLDR Collation Algorithm) being a lot of work. Last time I looked, the latter wasn't clearly defined. --RichardW57m (talk) 10:31, 2 May 2024 (UTC)Reply
This LKZ list may not be reliably sorted. We have four consecutive items gabumas, gabūnas, gabuoti, gaburdalas, but the distinctly non-empty lists of words starting 'bub' and 'būb' do not overlap! Likewise for words starting gabu and gabū. At https://zodynas.vz.lt/terminaiRaidec.php, I found the index headings/links ABCČDEĖFGHIJKLMNOPRSŠTUŪVZŽ, but 'E' and 'Ė' and also 'U' and 'Ū' pointed to the same lists! Possibly that's a case of the author and the software having different ideas about Lithuanian sorting. The list for 'U' and 'Ū' contained examples of both as initial letter. --RichardW57m (talk) 13:10, 2 May 2024 (UTC)
@Benwing2:, (Notifying Agamemenon, Apisite, BigDom, GabeMoore, Insaneguy1083, Helrasincke, Hippietrail, RichardW57, Sławobóg, 70.175.192.217): I will need some help assessing what is a 'standard' Lithuanian dictionary. I've now found what looks like one at https://archive.org/details/lyberis-sinonimu-zodynas-2002/page/106/mode/2up; (Sinonimų Žodynas = "Dictionary of Synonyms") the page linked to makes it rather obvious that 'e', 'ę' and 'ė' are the same in some sense, and page 128 shows the intermingling of the three. Page 244 shows that 'u' and 'ū' have the same primary weight. Page 252 shows sameness for 'a' and 'ą'. Page 147 proclaims sameness for 'i', 'į' and 'y'. Page 528 proclaims the sameness of 'u', 'ų' and 'ū', though a demonstration for u ogonek will need more searching. --RichardW57m (talk) 15:00, 2 May 2024 (UTC)Reply
@RichardW57m Implementing the UCA for general use on Wiktionary would be a lot of work because (a) it presents performance difficulties due to the size of the UCA dataset and complexity of tailorings, and (b) the sortkeys generated by the UCA aren't suitable as Wikimedia category sortkeys since they're numeric, which would mess up category headers. As such, any general implementation will need to solve that issue, too.
Implementing something which emulates the UCA for one language is probably not too difficult, though (like our other sortkey algorithms) it would only support the relevant diacritics for the language. Theknightwho (talk) 15:17, 2 May 2024 (UTC)
Using the UCA doesn't force the use of the DUCET. To be honest, most of the CLDR tailorings don't look too wonderful when applied to characters not used outside the language's known character range. I suggest the keys be loaded as strings (which is what we do now). The UCA implementation notes admit that the DUCET values are unfit for any system that expects keys to be C strings, and explain how to convert them.
Even now, starting a category display of Lithuanian terms at 'Y' causes the first letter header to be shown as 'I'. I think that sort of thing is inevitable for any letter that sorts out of code point order. --RichardW57m (talk) 16:10, 2 May 2024 (UTC)Reply
I finally found the demonstration of 'u' and 'ų' having the same primary key in the sequence 'skùsti', 'skų́sti', 'skų́stis', 'skùtas' on page 440. Word-internal 'ų' is inconveniently infrequent. --RichardW57 (talk) 21:21, 2 May 2024 (UTC)
The only CLDR collation for Lithuanian gives "Bronius Piesarkas: Lithuanian-English Dictionary ISBN 9986-465-56-7" as its source, and this collation defines ogonek on 'a', 'e', 'i' and 'u' not to make a difference in primary weight, 'e' and 'ė' not to differ in primary weight, 'i' and 'y' not to differ in primary weight, and 'u' and 'ū' not to differ in primary weight. The only weird thing about this collation is that it defines the accentuation marks (acute, grave and tilde) to make secondary differences, and thus be more significant than capitalisation, which I don't believe. It's also saying that, inter alia, a difference of accent on the first syllable would outweigh a difference between 'e' and 'ė' in a subsequent syllable. That's probably refutable. --RichardW57 (talk) 22:02, 2 May 2024 (UTC)Reply
@RichardW57 I hope this might be what you're looking for (excerpted from Ambrazas et al):
In dictionaries and other lists of words arranged in alphabetical order, a and ą, e, ę and ė, i, y and į, u, ū and ų are treated as if they were identical letters, even though they represent different sounds. Therefore the following alphabetical order is customary: aržùs, ąsà, asamblė́ja; ẽsti, ė́sti, èstiškas; įkélti, ìkrai, ýla, ìlgas.[2]
This appears to be confirmed by, for example, the Lyberis Lithuanian-Russian dictionary where gadýnė is followed by gadìnimas, gabėndinti is followed by gabẽnimas (under which we find e.g. gabénti, gabéntis).[3] Helrasincke (talk) 22:15, 4 May 2024 (UTC)
@Helrasincke @RichardW57 Yeah, this is easy enough to implement, if no one objects I'll go ahead and do that. Benwing2 (talk) 00:19, 5 May 2024 (UTC)Reply
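For illustration, here is a minimal sketch of the primary-level folding that Ambrazas et al. describe, written as plain Lua and not taken from any existing Wiktionary module. The name `primary_key` and the use of "~" to place č, š and ž after their base letters are assumptions of this sketch, and the input is assumed to be a lower-case page name with the accentuation marks already stripped.

```lua
-- Hypothetical sketch; not taken from Module:languages or any data module.
-- Letters that share a primary weight are folded onto one base letter.
-- "~" (byte 126) sorts after every ASCII lower-case letter, so "c~" places
-- č after all of c and before d; likewise for š and ž.
local fold = {
	["ą"] = "a",
	["ę"] = "e", ["ė"] = "e",
	["į"] = "i", ["y"] = "i",
	["ų"] = "u", ["ū"] = "u",
	["č"] = "c~", ["š"] = "s~", ["ž"] = "z~",
}

-- Assumes lower-case UTF-8 input without acute, grave or tilde accents.
local function primary_key(word)
	-- The pattern matches one UTF-8 encoded character at a time; characters
	-- missing from the fold table are left unchanged.
	return (word:gsub("[\1-\127\194-\244][\128-\191]*", fold))
end
```

With keys like these, skęsti and skėsti receive the same key, so their relative order would have to be settled by a further tie-break.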
The only question is whether this is what people would prefer, as opposed to government directive. I've been surprised by the evidence that some would prefer to treat 'e' v. 'ė' and 'u' v. 'ū' as primary differences. We did have someone object to the CLDR treating 'i' v. 'y' as a secondary difference. The objection was dismissed as being based on ignorance, but I can't help wondering if there is a folk collation. --RichardW57 (talk) 14:06, 5 May 2024 (UTC)Reply
Curiously enough, Cathy Wissink and Michael Kaplan get Lithuanian 'i' v. 'y' wrong on p. 11 of https://www.infitt.org/ti2002/papers/03WISSIN.PDF - carelessness, ignorance or misinformation? --RichardW57 (talk) 14:35, 5 May 2024 (UTC)
@RichardW57 That is an interesting find. To be honest, it seems like they're right insofar as y does immediately follow i (as opposed to preceding it), but only when all else is equal (see below, I presume this is the secondary collation in action, though I have to admit I'm still getting my head around primary vs. secondary). I've just checked the Lithuanian Root List and it has the following to say:
The morphemes in the root, prefix, and suffix lists are presented in Lithuanian alphabetical order. As in Lithuanian dictionaries, the letter y (designating long /i:/) is treated alphabetically as a variant of i, and in cases where otherwise identical roots differ only in initial letter i vs. y, the root with initial i is listed first (e.g., ir-1, yr-1 ‘come apart’)[4]
So on that front, Wissink & Kaplan do appear to get that detail quite wrong. Thus Vakareliyska gives: bur- – būr- – būrb- – burn-; drambl- – drąs- – drask-, ėd- – egl-, yd- – iešk- ... ir- – yr- – irz-; ūg- – ugd-. In this sense Ambrazas's statement (but not the examples) that they are treated identically is also not entirely correct (or has perhaps been mistranslated). It appears the secondary order is: base letter > y/macron(?) > ogonek > dot. The only one I have my doubts about is the secondary order of y and į as well as ū and ų since Vakareliyska organises the heading on p.21 and p.56 respectively as "I, Į, Y" and "U, Ų, Ū" (compare "A, Ą" and "E, Ę, Ė") and I actually can't find any minimal examples to confirm or deny this, especially since from my understanding ų is quite rare outside of inflections. Helrasincke (talk) 21:19, 8 May 2024 (UTC)
The secondary order I find plausible is base, base + ogonek, base + length marking, which is consistent with the CLDR. The instances of base + length marking are 'ė', 'y' and 'ū'. That's also the ordering Wiktionary currently promotes to primary, and the ordering presented for longer versions of the alphabet. 'Dot above', except as part of ė, is probably less significant than capitalisation. --RichardW57 (talk) 07:22, 11 May 2024 (UTC)
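A minimal sketch of how that secondary order could act purely as a tie-break, assuming the order base < ogonek < length marking proposed above. The function name and key layout are illustrative only and are not the code of any Wiktionary module; input is assumed to be lower-case UTF-8 without accentuation marks.

```lua
-- Hypothetical sketch of a two-level sort key; all names are illustrative.
local base = {
	["ą"] = "a", ["ę"] = "e", ["ė"] = "e", ["į"] = "i", ["y"] = "i",
	["ų"] = "u", ["ū"] = "u",
	["č"] = "c~", ["š"] = "s~", ["ž"] = "z~", -- separate letters after c, s, z
}
-- Secondary weights: 0 = plain letter, 1 = ogonek, 2 = length marking.
local weight = {
	["ą"] = "1", ["ę"] = "1", ["į"] = "1", ["ų"] = "1",
	["ė"] = "2", ["y"] = "2", ["ū"] = "2",
}

local function two_level_key(word)
	local primary, secondary = {}, {}
	-- Walk the word one UTF-8 encoded character at a time.
	for ch in word:gmatch("[\1-\127\194-\244][\128-\191]*") do
		primary[#primary + 1] = base[ch] or ch
		secondary[#secondary + 1] = weight[ch] or "0"
	end
	-- A space sorts before every letter, so the secondary weights are only
	-- compared when the primary parts are identical.
	return table.concat(primary) .. " " .. table.concat(secondary)
end
```

Under these assumptions the keys reproduce Vakareliyska's sequences ir-, yr- and bur-, būr-, būrb-, burn-, while a word's diacritics never outweigh a difference in base letters.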
I put out a feeler about the reality of Lithuanian collation rules, and got an interesting indirect response from a Lithuanian. He said that the diacritics (and 'i' v. 'y'), other than those for accentuation, create what are treated as different letters and are sorted as such. When he then compared the system he described with a dictionary, he came back with the response that the ordering in the dictionary was 'weird'! (The intermediary was Bradn on the Zompist Bulletin Board.) On this basis, what we have at present is just fine! --RichardW57m (talk) 13:58, 15 May 2024 (UTC)

References

  1. ^ Antanas Lalis (1915) A Dictionary of the Lithuanian and English Languages, Chicago, page 325
  2. ^ Vytautas Ambrazas et al. (2006) [1997] Lithuanian grammar, 2nd edition, pages 15-16
  3. ^ Antanas Lyberis (2005) Lietuvių-rusų kalbų žodynas [Lithuanian-Russian Dictionary], page 182
  4. ^ Cynthia M. Vakareliyska (2015) A Lithuanian Root List, page 6


Strange behaviour of translation template

The translations for verb play, sense "deal with a situation in a diplomatic manner", display OK, as you will see. However, if everything else in the "translations" section is deleted, leaving only the "deal with a situation in a diplomatic manner" part, then the translations become corrupted with funny characters, stuff like "Finnish: ⦃⦃t+¦fi¦hoitaa¦¦¦¦¦¦¦¦¦⦄⦄", etc., as you can see here in a test edit that I have now reverted. (Of course, I do not actually want to delete everything else in the section. I wanted to make another edit, which went wrong for the same reason, and in trying to work out where the problem lay, I successively deleted other sections, until nothing else was left, to arrive at the minimal case that exhibited the problem.) Any ideas? Mihia (talk) 17:39, 1 May 2024 (UTC)

@Mihia This is not a bug. The {{tt}} and {{tt+}} templates need to be surrounded by {{multitrans}} in order for them to work and not display "funny characters" (as you say). If you delete the surrounding call to {{multitrans}}, you need to convert {{tt}} back to {{t}} and {{tt+}} back to {{t+}}. But a better solution is just to not do this. You can also put the call to {{multitrans}} around each translation section instead of once around all of them, but that will partly negate the memory-saving benefits of {{multitrans}}. Benwing2 (talk) 23:53, 1 May 2024 (UTC)Reply
To expand on Ben's point: the reason we originally used {{multitrans}} was to avoid hitting the 50MB memory limit on large pages (which is no longer much of a concern since the limit got raised to 100MB a few months ago), but it's still very useful because it helps with page loading times as well. water/translations would be totally unusable without it, for example. Theknightwho (talk) 00:26, 2 May 2024 (UTC)Reply
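Judging only from the garbled output quoted above, {{tt+}} seems to emit a placeholder in which ⦃⦃, ¦ and ⦄⦄ stand in for {{, | and }}, leaving the surrounding {{multitrans}} call to process it. The sketch below merely illustrates that correspondence; the function name is hypothetical and the real {{multitrans}} implementation may work quite differently.

```lua
-- Hypothetical sketch; the real {{multitrans}} implementation may differ.
-- Turns the placeholder form quoted above back into template-call text.
local function restore_template_calls(text)
	text = text:gsub("⦃⦃", "{{")
	text = text:gsub("⦄⦄", "}}")
	text = text:gsub("¦", "|")
	return text
end
```

Without a wrapper performing some step of this kind, the placeholders are shown to the reader as they are, which matches the "funny characters" reported above.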
Thanks. The way this is presently laid out, the start of "multitrans" is embedded within one "trans" block, and the end of it, "}}", I think I can now see, is embedded within another, so I must say that it is non-obvious to editors what is going on. It looks "obviously" as if each block is self-sufficient, and, e.g. that they can be reordered, but of course when I moved one block to the end this actually took it out of "multitrans" and broke it. Is there any way to lay this out more clearly? E.g. can "multitrans" be started at the very top on a separate line, and ended at the very bottom on a separate line? Mihia (talk) 09:29, 2 May 2024 (UTC)Reply
@Mihia That was a kludge because the translation-adder originally freaked out if you put the multitrans opening at the top, and no-one wanted to touch it if we didn't have to since the translation-adder's quite finicky. That bug's since been fixed, but lots of pages still do the original workaround as no-one's sent a bot round yet. It wasn't seen as a priority originally, since multitrans was only used on a handful of pages, but as memory issues became worse it eventually came to be deployed on hundreds of pages. Theknightwho (talk) 15:20, 2 May 2024 (UTC)Reply
I guess I understand from "since been fixed", then, that it is OK now to wrap "multitrans" around the whole section? (I can see that this does seem to work, but I wouldn't necessarily know if it is achieving the full memory-saving benefits.) Anyway, I've done it now, so please let me know if it's a problem. Thanks. Mihia (talk) 17:32, 2 May 2024 (UTC)
@Mihia Yes, AFAIK what you are suggesting is completely OK now. Benwing2 (talk) 18:58, 2 May 2024 (UTC)Reply

Occitan template requests

Can anyone create an Occitan past participle template and other essential templates? Other Romance languages such as Asturian, Catalan, French, Galician, Italian, Portuguese and Spanish already have past participle templates. Thank you in advance. Flummont (talk) 02:06, 2 May 2024 (UTC)

I guess you could make {{oc-pp}} as a copy of {{ast-pp}} to start with, as that does not use Lua and should be easy to adapt. But I can see that Occitan morphology is not always as simple as adding letters onto a fixed stem. Ultimately what needs to be created is Module:oc-headword, as a copy (😢) of Module:ca-headword or similar, with changes to the Catalan-specific logic there. @Benwing2 is the expert here. This, that and the other (talk) 02:49, 2 May 2024 (UTC)Reply
@This, that and the other Conceptually this is not so hard, but unfortunately I don't know that much about Occitan; and a complicating factor is the 6 or so different dialects, each of which (conceivably) forms its feminine and plural according to its own rules. Benwing2 (talk) 03:03, 2 May 2024 (UTC)Reply
Looking at oc:parlar#Conjugason, dialectal differences may not be an issue for past participles (at least in these four dialects, and for regular first group verbs). I was going to say "I'm sure Flummont knows more", but it seems that this editor does not actually speak Occitan. This, that and the other (talk) 04:13, 2 May 2024 (UTC)Reply
I took a look at the conjugations there. Unfortunately they only have Lengadocian conjugations for -ir/-er/-re verbs but the regular ones seem to be like parlar. However, the irregular ones (e.g. oc:Template:Conjugason/oc/leng/-odre-t, oc:Template:Conjugason/oc/leng/-dire) look to be more complex and probably differ dialect-to-dialect. Benwing2 (talk) 04:24, 2 May 2024 (UTC)Reply

Two issues with transliteration categories

This, that and the other (talk) 02:32, 2 May 2024 (UTC)Reply

@This, that and the other I agree with your first statement. I'll have to look into what's going on with honey. Benwing2 (talk) 03:08, 2 May 2024 (UTC)Reply
@Benwing2 I guess the hidden cat issue could be fixed by changing Module:category tree/poscatboiler#L-448 to return false, but I don't really understand the logic here. Umbrella categories don't contain entries, so why would they ever need to be hidden? This, that and the other (talk) 04:06, 2 May 2024 (UTC)Reply
@This, that and the other I think I wrote that code more or less mechanically. But in this case making that change wouldn't fix the issue because the 'Requests for ...' categories are all raw, so the first arm of the if-statement would apply. We need to make a change somewhere in Module:category tree/poscatboiler/data/entry maintenance, which generates its own umbrella categories, to not hide such categories. Benwing2 (talk) 04:16, 2 May 2024 (UTC)Reply
Maybe [5] will fix it. This, that and the other (talk) 04:40, 2 May 2024 (UTC)Reply
@This, that and the other Looks good to me. Benwing2 (talk) 04:58, 2 May 2024 (UTC)Reply

Lua Modules variable sized arguments and the arg magic variable

Several Lua modules have been "corrected" or modified and returned to the old Lua 5.0 way of dealing with variable-sized arguments. As Scribunto currently uses Lua 5.1, and efforts are ongoing that may update the Lua engine to newer versions, I think we should stick to more modern ways of dealing with vararg functions.

This is also important for me as I extract data from Wiktionary and the Lua version I rely on no longer understands this syntax.

@Theknightwho Module:languages now uses the (soon obsolete) arg var in :

function export.addDefaultTypes(data, regular, ...)
	local n = arg.n
	local types = n > 0 and concat(arg, ",") or ""

Please, could you change it to :

function export.addDefaultTypes(data, regular, ...)
   local arg = {...}
   local n = select( '#', ... )

As a quick fix ?
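A self-contained sketch of the pattern requested above, using a hypothetical stand-in function instead of the real export.addDefaultTypes (whose full body is not shown here). It relies only on `select` and the `...` expression, which are available in Lua 5.1 and later.

```lua
-- Hypothetical stand-in for the start of a vararg function; it is not the
-- body of export.addDefaultTypes.
local function join_types(...)
	local n = select("#", ...) -- number of extra arguments, counting nils
	local args = { ... }       -- explicit table instead of the implicit `arg`
	return n > 0 and table.concat(args, ",") or ""
end
```

The explicit table costs one allocation per call, which is the overhead discussed in the replies below.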

The same goes on with Module:scripts which has been "corrected" recently and introduced the same problem:

for i = 1, arg.n do
    if not types[arg[i]] then 

from the more modern (older) version:

for _, type in ipairs{...} do
   if not self._type[type] then

Also please consider modernizing the Module:tables function:

function export.append(...)
	local ret, n = {}, 0
	for i = 1, arg.n do
		for _, v in ipairs(arg[i]) do

Thanks in advance, Dodecaplex (talk) 21:14, 3 May 2024 (UTC)Reply
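For the append function quoted above, a portable variant might look like the following sketch. It is a hypothetical rewrite for illustration, not the code of the module, and it assumes every argument is a list-like table.

```lua
-- Hypothetical rewrite for illustration; not the code of the quoted module.
-- Concatenates any number of list-like tables into one new list.
local function append(...)
	local ret, n = {}, 0
	local lists = { ... } -- pack the arguments once
	for i = 1, select("#", ...) do
		for _, v in ipairs(lists[i]) do
			n = n + 1
			ret[n] = v
		end
	end
	return ret
end
```

The loop structure matches the quoted snippet; only the source of the argument list changes, from the implicit `arg` table to an explicit one.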

@Dodecaplex The instances where I’ve implemented this have been where it maximises performance, and I would not support changing that. The workaround that you suggest generates a performance hit, which is unacceptable for functions which are called many thousands of times, as happens with memoisation.
I really don’t see the point in caring about what is officially deprecated in Lua 5.1 while there is no chance of the functionality disappearing in the near future. I appreciate that it creates awkwardness for you, but the solution is for you to use a Lua 5.1 binary. Theknightwho (talk) 21:16, 3 May 2024 (UTC)Reply
@Theknightwho As Lua 5.1 is used in Scribunto, then arg is already deprecated.
In Lua 5.1, using arg in the function WILL imply a new table creation which will have a cost.
Using ... instead avoids the creation of the magic table named arg (which is always nil in Lua 5.1 when varargs are to be used). See for instance this stackoverflow answer.
My suggested changes were just a workaround, but if you want to avoid performance hits, then use ... directly, as it contains the arguments themselves (no overhead for creating the arg table):
select('#',...) instead of arg.n,
select(1,...) instead of arg[1],
select(2,...) instead of arg[2], etc. (well, it is not that easy as indeed select(2, ...) will give the unpacked table from position 2, see details in the cited question)
This will have a better performance impact and will last longer in the long term... Dodecaplex (talk) 21:57, 3 May 2024 (UTC)Reply
@Dodecaplex If you look at the instances where it's been used, a table has to be created anyway. It's possible to iterate with select, but that rapidly becomes much slower than iterating over a table. I appreciate it's deprecated in Lua 5.1, but that is a completely academic point, really - in practical terms, it just means it's not portable to Lua 5.2 or higher, but Wiktionary doesn't use 5.2 and there's no chance it will for the foreseeable future. If and when that changes, we can adapt the code. Theknightwho (talk) 23:16, 3 May 2024 (UTC)Reply
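The two iteration styles being compared can be sketched as follows; both helpers are hypothetical and neither is taken from a module. Each `select(i, ...)` call is handed the whole argument list again, so the first loop does more work per step as the number of arguments grows, whereas the second packs the arguments once.

```lua
-- Hypothetical helpers for comparison; neither is taken from a module.
-- select(i, ...) returns every argument from position i onwards, so the
-- parentheses below keep only the first of them.
local function sum_with_select(...)
	local total = 0
	for i = 1, select("#", ...) do
		total = total + (select(i, ...))
	end
	return total
end

-- Packs the arguments into a table once, then indexes the table.
local function sum_with_table(...)
	local args, total = { ... }, 0
	for i = 1, select("#", ...) do
		total = total + args[i]
	end
	return total
end
```

Both run unchanged on Lua 5.1 and later; which one is faster in practice depends on the number of arguments and would need measuring.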

Need help with creating Module:number list/data/vi

Hello guys! I've been planning on expanding the cardinal box of the Vietnamese cardinal number entries by creating Module:number list/data/vi based on the data given in the English Wikipedia article Vietnamese numerals. Similar to Module:number list/data/ko, the number list should include both the native Vietnamese and Sino-Vietnamese transliterations. Does anyone here know how to do this? Thanks in advance! ChemPro (talk) 05:53, 4 May 2024 (UTC)Reply

@ChemPro I can try to help you with that, but first you should try to make the module by copying the Korean one and filling in the corresponding Vietnamese forms. Benwing2 (talk) 18:48, 4 May 2024 (UTC)Reply
@Benwing2 Done! --ChemPro (talk) 08:04, 5 May 2024 (UTC)Reply
@ChemPro What you did was about half done, but I tried to fix it up. It still needs some work. Benwing2 (talk) 00:11, 6 May 2024 (UTC)Reply
@Benwing2 Hello, thanks for fixing the error. I have some remarks on what still needs to be improved or implemented:
  • For numbers which include the digit 1 from 21 to 91, the number 1 is pronounced as mốt.
  • Even though năm chục (50) is an alternative form to năm mươi, it is not used to construct numerals from 51 to 59 (the same principle applies to all the multiples of ten from 20 up to 90)
  • When the number 5 appears after 10 in the unit digit, the pronunciation changes from năm to lăm --ChemPro (talk) 07:42, 6 May 2024 (UTC)Reply
  • When the number 4 appears after 20 in the unit digit, pronunciation changes from bốn to --ChemPro (talk) 07:52, 6 May 2024 (UTC)Reply
    @ChemPro OK, thanks. I'll get to this tomorrow, going to sleep now :) ... Benwing2 (talk) 08:01, 6 May 2024 (UTC)Reply
Some additional information on the rules mentioned above:
  • Exceptions to the rule of changing the pronunciation from năm to lăm are numbers ending in 05 (such as 105, 605, 9405, 39605).
  • In some Vietnamese dialects, the number seven (bảy) is also read as bẩy. I tried to implement it, but it somehow caused an error. --ChemPro (talk) 08:36, 6 May 2024 (UTC)
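The mốt and lăm rules listed above can be sketched for the numbers 1 to 99 as follows. This is a hypothetical standalone function, not the data format used by Module:number list; the rule for 4 is left out because its replacement form is not given above, and larger numbers (including the 05 exception) are out of scope.

```lua
-- Hypothetical sketch; not the data format of Module:number list.
local units = { "một", "hai", "ba", "bốn", "năm", "sáu", "bảy", "tám", "chín" }

-- Returns the Vietnamese cardinal for an integer n with 1 <= n <= 99.
local function vi_cardinal(n)
	local tens, unit = math.floor(n / 10), n % 10
	if tens == 0 then
		return units[unit]
	end
	-- 10 is "mười"; the other multiples of ten use "mươi".
	local head = tens == 1 and "mười" or (units[tens] .. " mươi")
	if unit == 0 then
		return head
	elseif unit == 1 and tens >= 2 then
		return head .. " mốt" -- 21, 31, ..., 91
	elseif unit == 5 then
		return head .. " lăm" -- 15, 25, ..., 95
	end
	return head .. " " .. units[unit]
end
```

Alternative forms such as năm chục or the dialectal bẩy would need extra data alongside each number and are not modelled here.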

Bot preventing the creation of a redirect

A Wiktionary bot is preventing me from turning the page ناـ into a redirect to نا. Every time I try, the edit is labeled as a potentially harmful action. The page formerly contained prefix entries for four languages, which I have moved to نا. - Ash wki (talk) 10:08, 4 May 2024 (UTC)

@Ash wki: I have done the intended page for you, surely the filter distinguishes us by user rights. But I think admins should give you autopatroller, I have observed meticulously clean and reasonable edits in you. Note that according to the discussion Wiktionary:Beer parlour/2024/April § Arabic-script affixes @Benwing2 wanted to move affixes, probably to contain the character ـ, anyway, so you might want to put in your voice there. Fay Freak (talk) 11:35, 4 May 2024 (UTC)Reply

@Fay Freak Thanks a lot. I'll look into the discussion, thanks. Ash wki (talk) 11:44, 4 May 2024 (UTC)Reply

MY PAGE WONT UPLOAD

I WAS ONLY GIVING THE LINK TO MY WIKIPEDIA PAGE!!!! Lilly is cool (talk) 17:42, 4 May 2024 (UTC)Reply

@Lilly is cool: that’s exactly why … — Sgconlaw (talk) 17:47, 4 May 2024 (UTC)Reply
If you make it a wikilink, as in [[w:User:Lilly is cool]], the abuse filter won't think you're just linking to some random website. Chuck Entz (talk) 17:57, 4 May 2024 (UTC)Reply

I keep getting logged out

Ever since I cleared my site data for Wiktionary to fix the translation-adder, I have to log in again on my PC every day. However, my other devices don't have this issue. Any ideas? Aaron Liu (talk) 18:31, 4 May 2024 (UTC)Reply

@Aaron Liu Weird. When you log in, there's a button to remember the login for up to a month or so; do you have that checked? Also I've found that if you log out of any MediaWiki site, it logs you out of all of them. Benwing2 (talk) 18:46, 4 May 2024 (UTC)Reply
I’ve had the same issue today, so it might be a server problem. Theknightwho (talk) 19:05, 4 May 2024 (UTC)Reply
I didn't log out before clearing the data, and I have no problem at all staying logged in on my iPad. Aaron Liu (talk) 19:15, 4 May 2024 (UTC)Reply
@Benwing2 I am still logged in on other wikis, so logging in is instant. I'll try logging out and back in again, thanks. Aaron Liu (talk) 20:46, 5 May 2024 (UTC)Reply

Dialectal variation

I'm currently trying to improve the state of Polish dialects and subdialects, showing the hierarchy and what-not. Would {{dialect synonyms}} be the best option? The idea isn't exactly to show synonyms, but just different reflexes of the same word. Vininn126 (talk) 06:22, 5 May 2024 (UTC)Reply

What is the difference between what you're describing and an altform? Nicodene (talk) 07:08, 5 May 2024 (UTC)Reply
Synonyms, as I understand it, have different morphemes, alt forms don't; in short these would be alt forms. I'd be wary of placing dozens of these by village in a hard to structure alt forms section, it would be nice to organise subdialects by dialect and to be able to host many easily. But I'm wary as this seems to be for synonyms. Vininn126 (talk) 08:07, 5 May 2024 (UTC)Reply
@Vininn126 This sounds like a job for Descendants. Can you not put them on the proto-page and direct other pages to that page? BTW this is related to my post (maybe in the BP) about creating a generalization of {{alt}}. We may need a generalization of {{desc}}/{{desctree}} as well for these purposes. In general I'm not a fan of the dialect synonyms approach of using a separate module for the actual synonyms/alt forms/etc.; that is the approach used by {{etymtree}} and it didn't work. Benwing2 (talk) 19:47, 5 May 2024 (UTC)Reply
@Benwing2 They are indeed descendants. It could be possible to set up there, but there's also the issue of how to list them on the Polish page itself. The alternative forms section and {{alt}} might lead to a huge mess. Vininn126 (talk) 07:11, 6 May 2024 (UTC)Reply
Soft redirect? Do we need to list every dialectal form on every page? Benwing2 (talk) 07:17, 6 May 2024 (UTC)Reply
@Benwing2 That's not my intention. Dialectal forms would most definitely be listed on the Standard Polish reflex. My issue is how to do that. The forms themselves would also be soft redirects (although this sometimes gives me pause, as they often don't have the same definitions. They don't have the same declensions either, but I can make templates for that). The issue is that if I have, for example, lekarz (which it might have?) with 6-10 forms and labels, it might not sound that bad, but I think giving it more structure would aid the reader, i.e. being able to organize subdialects by dialect and also maybe even geographically. Vininn126 (talk) 07:23, 6 May 2024 (UTC)
@Vininn126 It does sound like you want a generalization of {{desc}}/{{desctree}} for use in Alternative forms or whatever. I think structuring it the way that Descendants sections do it would work well. Benwing2 (talk) 08:03, 6 May 2024 (UTC)Reply
I suppose I'll try that once I've done more work on the actual subdialects themselves. Vininn126 (talk) 08:05, 6 May 2024 (UTC)Reply

< > (Unsupported titles/`lt` `gt`): broken labels

{{lb|mul|Internet slang}} results in “(Internet slang[[Category:Translingual internet slang|]])”. J3133 (talk) 15:27, 5 May 2024 (UTC)Reply

@J3133 On which page? I tried this on a test page and it displays correctly. Benwing2 (talk) 19:42, 5 May 2024 (UTC)Reply
@Benwing2: have you tried clicking on the wikilink in the title? I suspect it has something to do with interaction between characters in the pagename and the wikitext generated by the modules for the categories, though I have no clue whether it's the modules or the js or both that's doing it. Chuck Entz (talk) 20:31, 5 May 2024 (UTC)Reply
@Chuck Entz Thanks, I see it now. Benwing2 (talk) 21:10, 5 May 2024 (UTC)Reply
@Theknightwho Can you please take a look at this? This is happening in makeSortKey in Module:languages. The pagename passed in is < >, which is the actual pagename (rather than the "Unsupported titles" version), and line 1282 removes HTML tags, with the result that the sort key is an empty string, which is why the display is garbled. I don't know why you are removing HTML tags, but it clearly will interact badly with any pagename that looks like an HTML tag or contains HTML tags. Benwing2 (talk) 21:32, 5 May 2024 (UTC)Reply
@Benwing2 Yes, this is something I’m aware of but a proper fix may not be straightforward, and may need to wait until the proper rewrite and disentanglement of the links and languages modules is ready. In the meantime, it should be possible to use the original input as a backup if the result is the empty string. Theknightwho (talk) 22:06, 5 May 2024 (UTC)Reply
@Theknightwho Can you code this up? Benwing2 (talk) 22:25, 5 May 2024 (UTC)Reply
Sure. Theknightwho (talk) 00:46, 7 May 2024 (UTC)Reply
@Benwing2 The (short-term) solution was for format_categories in Module:utilities to use data.encoded_pagename from Module:headword/page as the sort base instead of data.pagename, as that effectively instructs :makeSortKey() in Module:languages to treat any formatting characters in the input as literal, which makes sense if they're included in the actual page title. This isn't ideal, though: I don't think Module:languages should be dealing with escapes or handling formatting characters at all, as they should only be dealt with by modules which handle raw inputs and final outputs. Theknightwho (talk) 18:12, 7 May 2024 (UTC)Reply
@Theknightwho OK thanks! What you say makes sense; where would you move the makeSortKey() functionality that deals with escapes and formatting characters? Also under what circumstances can HTML end up in the sort key and need to be removed? Benwing2 (talk) 19:34, 7 May 2024 (UTC)Reply
@Benwing2 If I remember correctly, the original reason I added this is because inputs like {{head|en|foo|head=<sup>bar</sup>}} would push <sup>bar</sup> through :makeSortKey(), but I don't know if that's still the case.
The ideal situation would be for the Module:languages functions like :makeSortKey(), :makeDisplayText() (etc.) to treat their inputs as literal, with any formatting issues being handled by Module:links, Module:usex (or whatever), but that's only feasible if we have a consistent way to handle raw inputs, since they're non-trivial to deal with. This is what the wikitext parser is supposed to do, because it generates a "wikitext object" which can be processed in whatever way is needed while retaining any formatting in the text. However, it's hard to do that while ensuring good enough performance, because it needs a custom regex engine if you want to have string function functionality.
For sortkeys specifically, the solution is just to convert the wikitext object into the display text as a string (i.e. raw input → wikitext object → display text), which effectively amounts to removing any formatting characters, but it doesn't make sense to add all the extra complexity of parsing wikitext just for that. Theknightwho (talk) 19:46, 7 May 2024 (UTC)Reply
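The short-term fallback mentioned earlier in this thread (use the original input if stripping leaves an empty string) can be sketched as follows. This is a Python illustration only, not the Lua in Module:languages; the function name make_sort_key and the tag pattern are assumptions made for the example.

```python
import re

# Assumed stand-in for the tag-stripping step; the real module's pattern may differ.
HTML_TAG = re.compile(r"<[^<>]*>")


def make_sort_key(text: str) -> str:
    """Strip HTML-like tags, but keep the original input if nothing is left."""
    stripped = HTML_TAG.sub("", text).strip()
    # A page title such as "< >" looks like a tag and would otherwise yield "".
    return stripped if stripped else text


print(repr(make_sort_key("<sup>bar</sup>")))
print(repr(make_sort_key("< >")))
```

With this guard, a headword wrapped in markup still sorts by its visible text, while a title made up entirely of tag-like characters sorts by itself instead of by the empty string.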

Enabling collapsed quotations on Appendix: and Reconstruction: pages

I've already gone through every page in the Appendix and Reconstruction namespaces to remove bad uses of #*. Could one of our interface administrators change MediaWiki:Gadget-defaultVisibilityToggles.js#L-435 to include namespaces 100 (Appendix) and 118 (Reconstruction) so that quotations can display properly? (Before anyone asks, there are legitimate quotations on reconstructed entries, such as Reconstruction:Proto-Germanic/Harigastiz). Ioaxxere (talk) 16:55, 5 May 2024 (UTC)Reply

The issue is some of our reconstruction only languages actually have quotations, if I understand correctly... Which means they aren't reconstruction only. Vininn126 (talk) 17:00, 5 May 2024 (UTC)Reply
Yeah this is related to the issue with Proto-West-Germanic kamb being attested, which still hasn't been resolved. Benwing2 (talk) 19:40, 5 May 2024 (UTC)Reply
 Done This, that and the other (talk) 10:00, 6 May 2024 (UTC)Reply

"SORRY" abbreviations

Moved to Wiktionary:Tea room/2024/May#"SORRY" abbreviations

Why additional line under headings in discussion pages?

Was this announced? Voted on? Imposed by MediaWiki? Or by someone else? DCDuring (talk) 00:43, 7 May 2024 (UTC)Reply

It's coming from MediaWiki updates, which are announced at WT:Wikimedia Tech News/2024 (we really should get Tech News sent to GP imo).
The look might take a bit of getting used to, but I can already see the benefits. For instance, being able to see at a glance how recently each discussion was contributed to, rather than trawling through signatures, is clearly useful – it's proven its value on Wiktionary's mobile view, where it has been visible for some time.
In any event, the announcement advises that you can turn the new look off in the Editing tab of your preferences. This, that and the other (talk) 01:20, 7 May 2024 (UTC)Reply
@DCDuring Personally I think its usefulness outweighs any possible space wastage. Benwing2 (talk) 19:35, 7 May 2024 (UTC)Reply
It really only helps in longer discussions and wastes space otherwise. In any event, the at-a-glance feature on the watchlist is more useful and saves, rather than wastes, space. 14:23, 7 May 2024 (UTC) — This unsigned comment was added by DCDuring (talk • contribs).
Anything I can turn off is OK with me. DCDuring (talk) 20:19, 7 May 2024 (UTC)Reply
@This, that and the other BTW according to their Tech News 2024-18, they did a one-year A/B test of this feature and analyzed the outcome based on factors like how many replies they see and such. I'm impressed that they put the effort into doing this test (and one year is a very long time for an A/B test, but I understand that this may have been necessary to get statistical significance). Benwing2 (talk) 22:19, 7 May 2024 (UTC)Reply
Yuck. They should have made a legacy toggle in the settings. If someone creates a CSS override that restores the original appearance, please ping me. -- Sokkjō 04:18, 8 May 2024 (UTC)Reply
@Sokkjo, you can turn it off by unticking "Show discussion activity" at the bottom of the "Editing" tab of Special:Preferences. - -sche (discuss) 15:48, 8 May 2024 (UTC)Reply
@-sche: <3 -- Sokkjō 16:50, 8 May 2024 (UTC)Reply
It looks like the stats on how many days or months since the last edit are wrong: late December isn't "4 months ago". Chuck Entz (talk) 15:01, 8 May 2024 (UTC)Reply
@Chuck Entz Late December is between 4 and 4.5 months ago, no? Benwing2 (talk) 21:00, 8 May 2024 (UTC)Reply
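The arithmetic can be checked quickly. A minimal sketch, assuming 25 December 2023 as a representative "late December" edit (the actual date of the edit in question is not given here) and 8 May 2024 as the viewing date:

```python
from datetime import date

viewed = date(2024, 5, 8)        # date of the comments above
last_edit = date(2023, 12, 25)   # assumed example of a "late December" edit

days = (viewed - last_edit).days
months = days / 30.4375          # average month length in days (365.25 / 12)
print(days, round(months, 2))
```

Under that assumption the gap is 135 days, a little under 4.5 average months, so a display that rounds down would show "4 months ago".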

I cannot execute a sql file wiktionary dump

I'm trying to contribute to Wiktionary and would like to analyze the category links and make sure they are correct, but I'm not able to manipulate the SQL files. They are too large and my software crashes. How do you guys do it? This is the file I'm talking about

enwiktionary-latest-categorylinks.sql.gz

located here:

https://dumps.wikimedia.org/enwiktionary/latest/

It is 5 gigs. I tried importing it into Navicat but it did not work. I even tried dividing the file into pieces a mere 1/50th of its size, and that did not work either. Kylefoley202 (talk) 07:19, 8 May 2024 (UTC)Reply

@Kylefoley202 These are certainly very large files. There are a few ways to proceed:
  • If you just want to issue SQL queries against the categorylinks table, use Quarry.
  • You could set up a local installation of the MariaDB database server on your computer and use the provided command line tools to run the SQL script against your local server.
  • Or else write a simple script (say in Python) to parse the SQL file character by character. Here's a rudimentary script I wrote a while ago that converts a different table's SQL script (pagelinks) to CSV:
I hope this helps. If you want more advice, tell us what it is you want to do with the file. This, that and the other (talk) 11:57, 8 May 2024 (UTC)Reply
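For readers who want to try the third option, here is a minimal sketch of a character-by-character parser. It is not the script referred to above; it assumes the usual mysqldump layout (one long "INSERT INTO ... VALUES (...),(...);" statement per line, single-quoted strings with backslash escapes), and the function names are made up for the example.

```python
import gzip
from typing import Iterator, List

# Backslash escapes used inside quoted strings in mysqldump output.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "0": "\0"}


def parse_insert_line(line: str) -> Iterator[List[str]]:
    """Yield each (...) tuple of one INSERT line as a list of strings.

    Unquoted values (numbers, NULL) are returned as their literal text.
    """
    marker = " VALUES "
    start = line.find(marker)
    if not line.startswith("INSERT INTO") or start == -1:
        return
    row: List[str] = []
    field: List[str] = []
    in_row = in_string = False
    i = start + len(marker)
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\" and i + 1 < len(line):
                i += 1  # consume the escaped character
                field.append(ESCAPES.get(line[i], line[i]))
            elif ch == "'":
                in_string = False
            else:
                field.append(ch)
        elif ch == "'":
            in_string = True
        elif ch == "(" and not in_row:
            in_row, row, field = True, [], []
        elif ch == "," and in_row:
            row.append("".join(field))
            field = []
        elif ch == ")" and in_row:
            row.append("".join(field))
            yield row
            in_row = False
        elif in_row:
            field.append(ch)
        i += 1


def iter_rows(path: str) -> Iterator[List[str]]:
    """Stream rows from a gzipped dump without loading it all into memory."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            yield from parse_insert_line(line)
```

Because iter_rows is a generator, memory use is bounded by the longest single line rather than by the size of the file. Decoding with errors="replace" is a simplification: some columns (such as sort keys) are binary and may not be valid UTF-8.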
Thanks, I already wrote a Python script to parse the SQL file and I am up and running. I want to work on getting the IPA transcriptions of English words correct for the purposes of perfecting the rhymes categories. I'm a poet after all. I also want to build an app that both outputs sublemmas given a lemma and outputs lemmas given sublemmas. I want to correct any mistakes I find in Wiktionary. I need to figure out how to quickly make edits to Wiktionary with software and then upload them in a quick and painless manner. Kylefoley202 (talk) 19:50, 8 May 2024 (UTC)Reply
Also, I can't tell whom I'm talking to, that would help. Kylefoley202 (talk) 19:51, 8 May 2024 (UTC)Reply
@Kylefoley202 In terms of making changes to individual pages, you should use pywikibot, maybe in combination with mwparserfromhell to parse the wikitext. My bot scripts [6] do this, e.g. the find_regex.py script lets you download pages using various selectors, and push_find_regex_changes.py lets you upload pages after making changes in a text editor. Benwing2 (talk) 21:37, 8 May 2024 (UTC)Reply
@Kylefoley202: Please read WT:BOT. Before you make mass changes to Wiktionary entries, you need to show the community that you know what you're doing and that you will edit according to Wiktionary standards. You should also be aware that Wiktionary has its own ways of doing IPA that have been developed through discussion and consensus, so you might unintentionally fix some things that aren't really broken. I'm just saying this for the record. I'm sure you're responsible and know what you're doing. Chuck Entz (talk) 14:29, 9 May 2024 (UTC)Reply
This script seems like an incredibly convoluted way of implementing a regex. Ioaxxere (talk) 16:26, 9 May 2024 (UTC)Reply

Periodically post link to appropriate section of WT:Wikimedia Tech News?

Following up on TT&O's suggestion above: Apparently WT:Wikimedia Tech News/2024 often contains changes relevant to many of us. Would it make sense to post a link to the section of the subpage that had the latest posting? The full text would not be necessary and would take up a lot of space. (There have been 19 postings (~weekly) so far this year!) Presumably, anything that someone objected to, didn't understand, etc. would be made a topic. DCDuring (talk) 17:53, 8 May 2024 (UTC)Reply

Another approach would be for someone to read Tech News and try to anticipate which items would be of interest to our active users. DCDuring (talk) 17:58, 8 May 2024 (UTC)Reply
One other idea: right now, and then once at the start of each year, briefly use the same round-robin-esque move technique that keeps each new month's GP on the watchlist of everyone who watchlisted the previous month's GP, to move the year's Tech News page to here so that everyone who watchlists the GP (and is presumably interested in technical matters) also has the Tech News page on their watchlist. If that page is only updated about once a week, compared to the GP being updated many times a day, it should not be a burdensome thing to have in one's watchlist. (There is, of course, also the option of just getting Tech News delivered here, as suggested in the section above and on several prior occasions when it's happened that people failed to notice something which was technically announced in Tech News. This...honestly might be the better idea, since again, one update a week probably won't make the GP so large as to cause problems.) - -sche (discuss) 19:47, 8 May 2024 (UTC)Reply
I think just having the Tech News delivered to the Grease pit would be fine. Benwing2 (talk) 21:38, 8 May 2024 (UTC)Reply

Is it possible to output all of the IPA transcriptions of words with Quarry?

I want to go through all of the IPA transcriptions of words, check them and make sure they're correct, then correct the ones that are wrong. Is this possible to do with Quarry? I have already written some Python code, that goes through and sort of does that but of course it's not 100% accurate. I'm using one of the files uploaded here, I forget which one: https://dumps.wikimedia.org/enwiktionary/latest/ I realize I should have asked this question before I wrote the code, but, hey, live and learn. Also, I'm not very good with MySQL, in fact I just started learning it yesterday, so if you know of MySQL code that does this please post it. I also have a similar ambition to link all of the sublemmas to their superlemma. Is that also possible? Kylefoley202 (talk) 04:47, 9 May 2024 (UTC)Reply

@Kylefoley202 The IPA transcriptions are contained in the page content itself, which is inside the "pages-articles" dump. This is an XML file. Parsing this content is something a lot of people have spent time thinking about! You might like to explore https://github.com/tatuylonen/wiktextract for instance. This, that and the other (talk) 10:49, 9 May 2024 (UTC)Reply

In SQL Wiktionary category file, all words are in caps and does not distinguish between gas and GAs

In looping over this SQL file

enwiktionary-latest-categorylinks.sql.gz located here: https://dumps.wikimedia.org/enwiktionary/latest/

in Python (I can't figure out how to use this software: https://quarry.wmcloud.org/) I find the line:

1. "129690,'English_non-lemma_forms','GAS\\nGAS','2022-12-18 20:54:34','GAS','uppercase','page'"

But you would think that 'gas' is a lemma, not a non-lemma, but then you realize that they are referring to this entry:

2. https://en.wiktionary.org/wiki/GAs

Why does Wiktionary put all words in all_caps on this sheet? How does Wiktionary know how to input that line 1 into website 2? Kylefoley202 (talk) 06:56, 9 May 2024 (UTC)Reply

@Kylefoley202 what you have found is the category sort key, which is capitalised because sorting is case-insensitive. What you have to do is look at the page ID (that number at the beginning) and cross-reference this with the "page" table. See the MediaWiki database schema for more details. (Hint - the table names in the database schema diagrams are links to pages with much more info about the table.) This, that and the other (talk) 10:57, 9 May 2024 (UTC)Reply
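The cross-referencing step can be sketched as below. It assumes rows have already been parsed into lists of strings, that the "page" table starts with page_id, page_namespace, page_title, and that "categorylinks" starts with cl_from, cl_to; check the schema pages for the dump you are using. The function names are made up for the example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def index_pages(page_rows: Iterable[List[str]]) -> Dict[int, Tuple[int, str]]:
    """Map page_id -> (namespace, title) from rows of the "page" table."""
    return {int(row[0]): (int(row[1]), row[2]) for row in page_rows}


def category_members(
    cl_rows: Iterable[List[str]], pages: Dict[int, Tuple[int, str]]
) -> Iterator[Tuple[str, str]]:
    """Yield (category, real page title) pairs, ignoring the sort key."""
    for row in cl_rows:
        page = pages.get(int(row[0]))  # cl_from is the page ID
        if page is not None:
            yield row[1], page[1]
```

The upper-cased sort key is never used for identification here; only the numeric page ID links the two tables, which is how "GAS" in the category row resolves to the entry GAs.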
Cool, thanks for the reply. Kylefoley202 (talk) 06:32, 10 May 2024 (UTC)Reply
@Kylefoley202 Just FYI, I would recommend not bothering with SQL but just directly reading the "pages-articles" file [7], which contains all the current content (as of the date of the dump) for all pages. It is pretty easy to parse, either using mwparserfromhell or just rolling your own parser. If e.g. you are interested in English IPA transcriptions, you just need to look for pages that have an {{IPA|en|...}} template in them and pull out the IPA content. Benwing2 (talk) 21:03, 13 May 2024 (UTC)Reply
Also, because the file is big, you should use the XML SAX parser that's built into Python (import xml.sax). Benwing2 (talk) 21:04, 13 May 2024 (UTC)Reply
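A minimal sketch of that approach, using only the standard library. The regular expression is a deliberate simplification (it ignores nested templates and treats every unnamed parameter as a transcription), and the class and function names are made up for the example.

```python
import re
import xml.sax

# Simplified pattern: {{IPA|en|...}} with no nested templates inside.
IPA_TEMPLATE = re.compile(r"\{\{IPA\|en\|([^{}]*)\}\}")


def extract_english_ipa(wikitext):
    """Return the unnamed parameters of every {{IPA|en|...}} in the text."""
    found = []
    for match in IPA_TEMPLATE.finditer(wikitext):
        for param in match.group(1).split("|"):
            param = param.strip()
            if param and "=" not in param:  # skip named parameters
                found.append(param)
    return found


class IPAHandler(xml.sax.ContentHandler):
    """Calls callback(title, transcriptions) for each page that has any."""

    def __init__(self, callback):
        super().__init__()
        self.callback = callback
        self.current = None  # element currently being buffered
        self.buffer = []
        self.title = ""

    def startElement(self, name, attrs):
        if name in ("title", "text"):
            self.current = name
            self.buffer = []

    def characters(self, content):
        if self.current is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name != self.current:
            return
        value = "".join(self.buffer)
        self.current = None
        self.buffer = []
        if name == "title":
            self.title = value
        else:
            transcriptions = extract_english_ipa(value)
            if transcriptions:
                self.callback(self.title, transcriptions)
```

To run it over a real dump, something like xml.sax.parse(bz2.open(path), IPAHandler(print)) should work, since xml.sax.parse accepts a file object and reads it incrementally; the path and compression format depend on which file you downloaded.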

Template:pre categorization is broken when the second component is unlinked

See Ylimarkku for an example. The category link markup shows up as plaintext. — SURJECTION / T / C / L / 11:22, 9 May 2024 (UTC)Reply

@Surjection My suspicion is that the "term" (i.e. what's normally the link target) is being treated as the empty string somewhere in the process, and this is what gets used as the sortkey, since that's what's causing the category target to fail. There's special handling for the link itself in situations like this, since the alt text gets displayed instead of a failed link to nowhere, but there needs to be something for the sortkey as well. @Benwing2 is much more familiar with the affix module(s) than I am. Theknightwho (talk) 14:09, 9 May 2024 (UTC)Reply
@Surjection@Theknightwho Should be fixed; let me know if you see any more issues with the affix code. Benwing2 (talk) 23:52, 11 May 2024 (UTC)Reply

Edit request to MediaWiki:Common.css

Add a new line in between lines 27 and 28: min-width: fit-content;, which fixes an issue where headings are cut off or compressed on narrow screens. Maybe @This, that and the other, Benwing2 Ioaxxere (talk) 15:41, 9 May 2024 (UTC)Reply

@Ioaxxere can you share some screenshots or generally be more specific about what the problem is? If it's a MediaWiki issue it should be reported on Phabricator. This, that and the other (talk) 23:58, 9 May 2024 (UTC)Reply
@This, that and the other: See https://imgur.com/a/GT6DE58. I don't think this is a Mediawiki issue as it seems that all the CSS is working as intended. Ioaxxere (talk) 02:32, 10 May 2024 (UTC)Reply
@Ioaxxere Thanks for that. The issue looks to be specific to mobile (see phab:T316670 where a min-width solution is suggested). Note that Common.css is not loaded on mobile, so we have to modify MediaWiki:Mobile.css instead.  Done in [8]. This, that and the other (talk) 04:15, 10 May 2024 (UTC)Reply
@This, that and the other: Something similar happens on desktop when the viewport is extremely narrow: https://imgur.com/a/jWznzhF. Since this admittedly represents an extreme case, your solution is probably good enough (although I think it's better to have fewer lines of code). Also, what makes you say that common.css isn't loaded on mobile? I was able to change the mobile CSS with User:Ioaxxere/common.css. Ioaxxere (talk) 04:31, 10 May 2024 (UTC)Reply
@Ioaxxere The issue on desktop looks different, as the letters don't spill down the page one by one like they do on mobile. In any event it's an extreme case as you say, since narrow viewports are pathological on desktop. As for the non-loading of MediaWiki:Common.css on mobile, see mw:Extension:MobileFrontend#CSS styling. This, that and the other (talk) 05:38, 10 May 2024 (UTC)Reply
@This, that and the other: It turns out that the desktop case isn't so pathological after all — on 𑀏𑀓𑁆𑀓, the floated number box is wide enough to consume the header even on desktop. I strongly recommend that you implement my original request. Also, that documentation is clearly wrong, as it claims that mobile view also doesn't load user-defined stylesheets. If that were the case, I wouldn't have been able to post those screenshots. Ioaxxere (talk) 13:29, 10 May 2024 (UTC)Reply
It turns out that my change causes some problems with talk pages. Could you change the selector to h1, h2:not([data-mw-thread-id]), h3, h4, h5, h6? Ioaxxere (talk) 17:46, 10 May 2024 (UTC)Reply
@Ioaxxere  Done. This, that and the other (talk) 07:00, 11 May 2024 (UTC)Reply
@This, that and the other: The talk page problem hasn't been fixed since MediaWiki:Mobile.css still has the original selector. If Common.css is in fact loaded on mobile, then the Mobile.css code can be safely removed. Otherwise, it should be the same as in Common.css. Ioaxxere (talk) 18:54, 11 May 2024 (UTC)Reply
@Ioaxxere  Done that. This, that and the other (talk) 11:44, 12 May 2024 (UTC)Reply
Thank you! I think everything looks good now. Ioaxxere (talk) 19:46, 12 May 2024 (UTC)Reply
Regarding whether Common.css is loaded for mobile users: at least historically, it definitely wasn't, because over the years we've had to copy various bits of useful code from Common.css to Mobile.css because they were visibly not being loaded on mobile. (It would not surprise me if personal User/common.css pages function differently.) Surely we could just test this, though, e.g. by adding a test class to the main Common.css (but not Mobile.css) that makes things tagged with that class show up e.g. at 300% size and bright green, adding some text tagged with that test class to this discussion, waiting a while for caches to clear and such, and then seeing whether the text is in fact big and green on mobile. - -sche (discuss) 15:26, 12 May 2024 (UTC)Reply
@-sche I'm curious to find out as well. I've tagged this text with the class bigandgreen, so if it looks unusual on the mobile skin then we'll know for sure. Ioaxxere (talk) 19:46, 12 May 2024 (UTC)Reply
OK, and here's some text tagged with more complex styling. - -sche (discuss) 05:16, 13 May 2024 (UTC)Reply
I recall someone mentioning there was an unavoidable 5-minute delay before changes take visible effect: sure enough, for the first 5-7 minutes (both before and after clearing my cache) I was not seeing the text display any differently than normal on either Desktop or Mobile, logged in or logged out, on Firefox or Chrome. However, I do now see the big green text (and slightly more complex vertical and backgrounded/highlighted text) on Desktop, whether logged in or out, on Firefox or Chrome, as expected. I do not see any of that styling when I use the Mobile version of the site in Firefox or Chrome from my computer, nor when I use my actual mobile, so indeed it appears that only Mobile.css and not Common.css is loaded for mobile users (as expected). But I'll leave the test classes in place for another couple days in case it's a caching issue or anyone wants to test anything else, other browsers, etc. - -sche (discuss) 05:34, 13 May 2024 (UTC)Reply
Yeah I have run into that delay too. The message telling you to clear your cache is misleading and we should change it. Benwing2 (talk) 20:45, 13 May 2024 (UTC)Reply
OK, I changed it. Benwing2 (talk) 20:48, 13 May 2024 (UTC)Reply

Wrong categorisation of cites with an untranslated passage

{{cite-book|en|passage=bla}} puts the page into Category:English quotations with omitted translation, and {{cite-book|fr|passage=bla}} puts the page into Category:French quotations with omitted translation. This is wrong, because these cites do not have an explicitly omitted translation (see the description of the category). Moreover, the first case is doubly wrong, because the passage is English so it does not require a translation. This, that and the other (talk) 05:49, 10 May 2024 (UTC)Reply

@This, that and the other: When I converted {{cite-book}} to use Module:quote, I made it automatically set |t=- if no translation is provided to suppress the "Please add a translation" message for compatibility with the original behavior of {{cite-book}}. I didn't realize that it was generating categories, too. I can add handling for |t=! to suppress the nag and suppress the categorization unless anyone has a better suggestion for handling this. JeffDoozan (talk) 01:12, 11 May 2024 (UTC)Reply
@JeffDoozan This use of |t=! seems a bit hacky and will expose this to the end user (which might not be wanted). What I might do instead is set an additional flag in the call to format_usex() to suppress the request-for-translation category (note that the category is already suppressed if the language is English or Translingual). Benwing2 (talk) 22:03, 11 May 2024 (UTC)Reply
It's causing a similar problem for 'Lithuanian' - Untranslated citing of a journal. --RichardW57m (talk) 11:21, 13 May 2024 (UTC)Reply
@JeffDoozan Are you able to take a look at this in the next couple of days? Benwing2 (talk) 20:43, 13 May 2024 (UTC)Reply
This should be fixed now. I added a 'noreq' param to format_usex() to suppress the request for translation. JeffDoozan (talk) 21:31, 13 May 2024 (UTC)Reply
@JeffDoozan Thanks! Benwing2 (talk) 21:36, 13 May 2024 (UTC)Reply

While things are now better, how do I stop {{cite-journal}} with text being recorded as a quotation? The problem occurs in the use of {{Template:R:xsv:Zinkevičius85}}, which cites an article in Lithuanian that gives a Sudovian word list. Expanding |1=lt by adding |termlang=xsv switched the categorisation from terms with Lithuanian quotations to terms with Sudovian quotations, but these aren't quotations: they're mentions of Sudovian words. --RichardW57 (talk) 07:01, 14 May 2024 (UTC)Reply

@JeffDoozan I think we need an extra cite-specific parameter for this; maybe |noquote=1 or |mention=1 or something. Benwing2 (talk) 07:20, 14 May 2024 (UTC)Reply
Should cites normally be treated as quotations? The term might not appear in the quoted citation at all! --RichardW57m (talk) 13:13, 14 May 2024 (UTC)Reply
I did try both |brackets=1 and |brackets=on, but they seemed to leave it as being a quotation. Additionally, as it's a citation, the author came first, which caused a Wikimedia parsing problem because the author text is a Wikimedia link - the author has a Wikipedia article on the Lithuanian Wikipedia but not the English Wikipedia.
When we come up with a solution, it also needs applying to the reference at panedielis. --RichardW57m (talk) 13:37, 14 May 2024 (UTC)Reply
What should happen when |mention=1 is provided? Does it suppress all categories? Add them to a new category? Change the formatting? If it's just suppressing the categories, would using the existing |nocat=1 to suppress all categorization work? JeffDoozan (talk) 14:10, 14 May 2024 (UTC)Reply

Early access to the dark mode (mobile web, logged-in)

Hi everyone, as announced in November, the Web team at the Wikimedia Foundation is working on dark (sometimes also called night) mode. Now, we have released the feature for logged-in users of advanced mobile mode across all wikis for testing purposes. But don't worry, the new feature is not disruptive! (See the "known limitations" section below.) It's just important for us to work together with you before we release this feature to a wider audience. Our goals for the early rollout are to:

  • Show what we've built very early. The earlier you are involved, the more your voices will be reflected in the final version
  • Get your help with flagging bugs, issues, and requests
  • Work with technical editors to adjust various templates and gadgets to the dark mode

Go to the project page and the FAQ page to see more information about the basics of this project.

Known limitations of the initial release

  • Currently, dark mode is only available on mobile, for logged-in users who have opted into advanced mode, as an opt-in feature.
  • Gadgets may initially not work well with dark mode and may have to be updated.
  • Our first goal is making dark mode work on articles. Special pages, talk pages, and other namespaces have not been updated to work in dark mode yet. We have temporarily disabled dark mode on some of these pages.

What we would like you to do (the broad community)

If you have questions - ask us! Also, where appropriate, consider linking to the Recommendations for dark mode compatibility on Wikimedia wikis on pages explaining how to define colors in code. Soon, this page will be marked for translation. We would like to emphasize that the recommendations may evolve. For this reason, we do not suggest creating local wiki copies of the recommendations. At some point, the copy could become different from the original version.

What we would like you to do (template editors, interface admins, technical editors)

When most bugs are solved, we'll be able to make the dark mode available for readers on both desktop and mobile. To make this happen, we need to work together with you on reporting and solving the problems.

  1. To turn it on, use the mobile website and go to the settings part of your menu and opt into advanced mode, if you haven't already. Then, set the color to dark. (Later, we will be allowing the device preferences to set dark mode automatically).
  2. Next, go to different articles and look for issues:
    • If you have noticed an issue with a template but do not know how to fix it
      1. Go to the recommendations page and find a relevant example
      2. If no relevant example is available or you're not sure of the fix, contact us
    • If you want to debug many templates in dark mode
      1. Go to https://night-mode-checker.wmcloud.org/ and identify templates that need to be fixed. The tool flags the top 100 most read articles.
      2. Go to the recommendations page and find a relevant example
      3. If no relevant example is available or you're not sure of the fix, contact us
    • If you want to identify problems beyond the top 100 articles.
      1. Install the WCAG color contrast browser extension (Chrome, Firefox) and visit some articles. Use it to identify problems
      2. Go to the recommendations page and find relevant examples
      3. If no relevant example is available or you're not sure of the fix, contact us
    • If you have a bug report for dark mode that is not related to templates
      1. Take a screenshot of what you are observing.
      2. Contact us. If possible, please write down your browser version and operating system version.

Thank you. We're looking forward to your opinions and comments! SGrabarczuk (WMF) (talk) 15:29, 10 May 2024 (UTC)Reply

Exciting. I tried it just now and notice the following things: as you say, WT:SB reports that it is only available in light mode. On mainspace pages, the few things I could think of to check all worked just as well in dark mode as in light mode. MediaWiki:Gadget-OrangeLinks.js works (in both light and dark mode) to turn a link orange if it is specifically 'tagged' with a specific language which isn't present on the linked page; conversely, whereas on desktop bare [[wikilinks]] like that are sometimes colored orange when they are implied to be pointing to a section which isn't present, on mobile this is not the case (probably just because whatever local en.Wiktionary functionality turns such links orange on desktop has been deemed by us too large or complex to turn on for mobile). In other words, on desktop, both [[suadero]] and {{l|en|matambre}} in this revision of rose meat show up as orange (as neither page has an English section yet), but on mobile only {{l|en|matambre}} is orange (and it is orange in both light and dark view), whereas [[suadero]] is blue. - -sche (discuss) 19:40, 10 May 2024 (UTC)Reply
I guess I don't understand what the big deal is about dark mode. Back in the old old days of TRS-80's and such, "dark mode" was the only thing you got and people hated it, hence the switch to black-on-white text. Benwing2 (talk) 21:46, 11 May 2024 (UTC)Reply
Me neither … — Sgconlaw (talk) 21:48, 11 May 2024 (UTC)Reply
Some issues I noticed at a cursory glance:
  • The coloured backgrounds for {{top3}} and friends need to be inverted. But the text is still perfectly legible.
  • Same for the coloured backgrounds of the content of collapsible boxes like {{trans-top}} and practically all our inflection templates. However, this is not a big deal, as (a) the text is still legible, and (b) the boxes are initially collapsed so do not really interfere with the overall "dark" look of the page.
This, that and the other (talk) 11:41, 12 May 2024 (UTC)Reply

Help needed with Template:RQ:Kojiki

This initially was for displaying bibliographic details and an optional first parameter for some unspecified text that was simply displayed if present. @Poketalker envisioned it as having an optional numeric first parameter for the volume number, and wrote the documentation that way. When @JeffDoozan converted it to use Module:quote, he assigned the first parameter to the |section= parameter.

The problem is that all but a couple of uses that had a first parameter had non-numeric text in it consisting of things like "Poem 1". When Poketalker added code to convert the section number in the first parameter to a section name, this caused ParserFunction errors in almost the entire transclusion list because the code did arithmetic on non-numeric text. I fixed a handful by adding a pipe to add an empty first parameter, then got tired of it and posted on Poketalker's talk page to let them know about the problem. Their attempts at fixing the problem haven't been successful.

So, what can be done about this? After Poketalker's "fix" there are "only" 51 entries in Category:Pages with ParserFunction errors. That's unacceptable. I see a few potential ways out of this.

  1. Restore the code to do arithmetic on the first parameter and have a bot add a pipe in front of all non-numeric first parameters
  2. Figure out some way to check if the first parameter is numeric, and either apply the arithmetic selection code if it is or treat it the same way as the optional second parameter if it isn't.
  3. Get rid of the whole volume number concept and code, since only a couple of entries actually use it.

A secondary question is whether this even needs to use a module. It looks like most of the usage is of this template followed by the actual quotes and their translations (on separate lines using other templates). The following lines could be converted into parameters for an improved template, but without that, the module contributes nothing.

Also pinging @Eirikr, who originally created the template. Chuck Entz (talk) 17:38, 10 May 2024 (UTC)Reply
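Option 2 in the list above amounts to a type check before any arithmetic is attempted. A minimal sketch of that logic in Python, for illustration only: the template itself would need wikitext or Lua, and the names classify_first_param and VOLUME_NAMES, as well as the placeholder volume labels, are hypothetical rather than taken from the template.

```python
import re

# Hypothetical placeholder labels; the template's real section names are not shown here.
VOLUME_NAMES = {1: "Volume 1", 2: "Volume 2", 3: "Volume 3"}


def classify_first_param(value):
    """Return ("volume", name) for a known volume number, else ("text", value)."""
    text = (value or "").strip()
    # Only plain ASCII digits count as numeric, so "Poem 1" never reaches int().
    if re.fullmatch(r"[0-9]+", text):
        number = int(text)
        if number in VOLUME_NAMES:
            return ("volume", VOLUME_NAMES[number])
    # Non-numeric or out-of-range input is passed through as plain text,
    # so no arithmetic is ever performed on it.
    return ("text", text)
```

With this shape, existing uses such as "Poem 1" would keep displaying as free text, and only the couple of entries with a real volume number would take the numeric path.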

How to turn off the new thing on talk pages[edit]

e.g. "Latest comment: 22 hours ago | 1 comment | 1 person in discussion | Subscribe". How to get rid of this and go back to normal? Equinox 20:56, 10 May 2024 (UTC)Reply

In "Preferences". I just turned off loads of features, including the crap you mentioned P. Sovjunk (talk) 21:45, 10 May 2024 (UTC)Reply
Specifically, untick "Show discussion activity" at the bottom of the "Editing" tab of Special:Preferences, see above. - -sche (discuss) 22:47, 10 May 2024 (UTC)Reply
Having it give "Latest comment" information in diffs is just bizarre. Not only is there better and more accurate information in the diff header, it treats the diff as the latest comment, even if you're looking at very old diffs. While it's technically true that the diff was (by definition) the latest comment as of the time on the diff, it's utterly pointless and somewhat misleading. Chuck Entz (talk) 01:28, 11 May 2024 (UTC)Reply
@Chuck Entz Where does it give "Latest comment" info in diffs? Can you elaborate? Benwing2 (talk) 21:44, 11 May 2024 (UTC)Reply
@Benwing2: The same place it does when you view a page the usual way. Under where it says "Wiktionary:Grease pit/2024/May: difference between revisions" is a link to the parent page "<Wiktionary:Grease pit". Below that is where the new system says something to the effect of "Latest comment [] hours ago by []". I've turned the new system off, so I can't give you specifics, but the number of hours ago and the author of the content apply to the diff being viewed, not the conversation as a whole. It kind of makes sense if you think of the diff being a snapshot of the state of the page as of the time of the edit, but it's redundant to the diff header:
  • Revision as of 14:44, May 11, 2024 edit undo thank
  • Benwing2 (talk | contribs)
  • (How to turn off the new thing on talk pages: Reply)
  • (Tag: Reply)
The above is the diff header for the edit I'm replying to, minus a couple of admin-only bits. I have a gadget enabled that adjusts time stamps to local time, so you'd have to add 7 hours to get the usual UTC. The new system would say something like "Latest comment 1 hour ago, by Benwing2", ignoring any later comments to the same topic. In this case there aren't any, but I seem to remember seeing a case where someone posted a reply, then someone else posted a reply, with only the first reply reflected when I viewed the diff for the first reply: i.e., if you posted a reply to a given topic, and Sgconlaw posted a reply to the same topic an hour later, it would say "Latest comment 2 hours ago by Benwing2" when I viewed the diff for your edit, but "Latest comment 1 hour ago by Sgconlaw" when I viewed the next diff. Chuck Entz (talk) 22:18, 11 May 2024 (UTC)Reply
@Chuck Entz I see, now, yes. You have sharp eyes; I didn't even notice the "Latest comment" stuff at the very top of the page, only the one underneath individual section headers. Benwing2 (talk) 23:51, 11 May 2024 (UTC)Reply
This is indeed bizarre and useless, from what I can see, which you and the site coders might be interested in due to my attention atypicalities. Not that I would pay attention to it. I was delighted when first seeing the new thing, while I also use Vector 2022 as the site skin. Fay Freak (talk) 23:29, 11 May 2024 (UTC)Reply
Another problem: suppose I want to link to a specific section of a talk page. I think there used to be a table of contents, or a right-clickable link ("copy to clipboard") or something, but now I can't find a way to grab that section link. Equinox 12:16, 12 May 2024 (UTC)Reply
@Equinox: I do have the anchor symbol giving me the link due to a gadget. Preferences → Gadgets → User interface gadgets → ⚓: add copyable anchor links to sections Fay Freak (talk) 12:21, 12 May 2024 (UTC)Reply
Neat. I wonder why it says "undefined" when you hover the mouse pointer over the anchor icon! Equinox 12:23, 12 May 2024 (UTC)Reply
Well, you should wonder, you have habitually coded websites, I assume, this is one of the things one notices then. I don’t read hover texts much. How people of your branch are wont to respect the tooltips in xkcd comics, rather than continuing information being hidden in such a way discontenting them, is beyond me: this is how it always looks to me. It should have a text for text-based browsing I guess, now somebody has to find where to add the source line. Fay Freak (talk) 12:45, 12 May 2024 (UTC)Reply
I hate doing Web sites. I still spell it "Web site" because I am a Victorian Charles Dickens. But nobody wants a bloody Windows desktop app any more. sigh sigh. Equinox 19:52, 12 May 2024 (UTC)Reply

Proper nouns, def=1 but where it's optional[edit]

See Talk:Leopards Eating People's Faces Party, latest thread by @J3133. Sometimes we want to specify "the" (def=1 parameter) but it's optional, not mandatory. Equinox 12:15, 12 May 2024 (UTC)Reply

The quotations without "the" look like broken English to me. Ioaxxere (talk) 19:48, 12 May 2024 (UTC)Reply
Okay but I have definitely seen "optional the" cases other than this one. Equinox 19:51, 12 May 2024 (UTC)Reply
@Equinox @Ioaxxere Use |the=~; it should be documented on the {{en-noun}} page. Benwing2 (talk) 19:55, 12 May 2024 (UTC)Reply

Kyūjitai spellings displaying as orange links[edit]

It seems that MediaWiki:Gadget-OrangeLinks.js is incompatible with {{ja-see}}, as the "Alternative spelling" box on Japanese kanji entries is displaying kyūjitai spellings as orange links despite the Japanese L2 header being present. See 嘘だろ and 尽くした. Binarystep (talk) 01:29, 13 May 2024 (UTC)Reply

I suspect this is happening because those forms aren't categorized as either lemmas or non-lemma forms. Benwing2 (talk) 20:57, 13 May 2024 (UTC)Reply
What's weird is that it only does this with kyūjitai spellings. Other alternate forms, such as hiragana spellings, categorize correctly. Binarystep (talk) 05:05, 14 May 2024 (UTC)Reply
@Binarystep I think to fix this we should make kyūjitai spellings be either lemmas or non-lemmas. I notice for example that romaji spellings are considered non-lemmas and hiragana spellings are considered lemmas. Pinging Japanese editors: (Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, 荒巻モロゾフ, Shen233, Cpt.Guapo, Sartma, Lugria, LittleWhole, Chuterix, Mcph2): . Benwing2 (talk) 05:23, 14 May 2024 (UTC)Reply
I don't know if changes have been made in the hour or so since Benwing posted, but the kyūjitai, kana, and kanjitab all appear blue to me. I'm using the Firefox web browser on a Windows 11 computer, if that's any clue. Cnilep (talk) 06:20, 14 May 2024 (UTC)Reply
Still orange for me. I'm using Chrome on Windows 10, if that's relevant. Binarystep (talk) 06:23, 14 May 2024 (UTC)Reply
@Cnilep They appear orange to me, too, after a couple of seconds for the OrangeLinks gadget to do its thing. Do you have that gadget enabled? Benwing2 (talk) 06:51, 14 May 2024 (UTC)Reply
Oh, no, sorry, my mistake. I don't have the gadget turned on. Cnilep (talk) 07:01, 14 May 2024 (UTC)Reply

Inflections wrongly doubled[edit]

Hi. On the page ドラマティック, I noticed that the romaji transcription of the inflections wrongly duplicates the inflectional endings ("doramatikku na na" and "doramatikku ni ni", with "na" and "ni" each appearing twice), even though the Japanese forms are correct (ドラマティックな and ドラマティックに, with a single な [na] and a single に [ni]). The problem seems to come from the template {{ja-adj|ドラマティック|infl=na}}. Can somebody please check? Thank you in advance. SenseiAC (talk) 12:27, 13 May 2024 (UTC)Reply

@Theknightwho Could you take a look? This is caused by this diff [9] you made last December. Benwing2 (talk) 20:55, 13 May 2024 (UTC)Reply

Check for blacklisted images we're actually using[edit]

A while ago, a troll was adding 'bad' images to entries, so we (someone) copied Wikipedia's image blacklist over to MediaWiki:Bad image list. This included images that don't exist (deleted but never cleared out of WP's list; I cleared them out of ours), redirects (where if the troll wanted to use the image, they could, because only the redirect is on the list🙃), and images we're validly using which blacklisting broke. By chance, I've happened upon and been able to un-blacklist two (the nipple at nipple and swastika at Nazism); can someone check whether any other images on the "bad image list" are actually in use and thus need to be unblacklisted? - -sche (discuss) 06:56, 14 May 2024 (UTC)Reply

@-sche File:Human buttocks.jpg is used on buttock. Binarystep (talk) 12:23, 14 May 2024 (UTC)Reply
Anyone else find that image on buttock a bit glaring? I think the one at nipple is somewhat more tasteful. Benwing2 (talk) 06:46, 15 May 2024 (UTC)Reply
Heads up that I replaced the latter with a crop. I don't think it changes your point, tho. —Justin (koavf)TCM 06:57, 15 May 2024 (UTC)Reply
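The check requested at the top of this thread could be scripted. A sketch under stated assumptions: parse_bad_image_list follows the list format described on the MediaWiki page itself (only lines starting with * count, the first link is the file, later links on the same line are exception pages), while the usage data passed to blocked_uses is assumed to have been fetched separately, for example through the API's list=imageusage query. Both function names and the sample titles in the comments are made up for illustration.

```python
import re

# Matches [[Target]] or [[:Target|label]] and captures the target title.
LINK = re.compile(r"\[\[:?([^\]|]+)(?:\|[^\]]*)?\]\]")


def parse_bad_image_list(wikitext):
    """Map each listed file to the set of pages where it is allowed."""
    entries = {}
    for line in wikitext.splitlines():
        if not line.startswith("*"):
            continue  # only list items are considered
        links = [title.strip() for title in LINK.findall(line)]
        if links:
            # First link is the file; any further links are exceptions.
            entries[links[0]] = set(links[1:])
    return entries


def blocked_uses(entries, usage):
    """Return {file: pages} for pages that use a listed file without an exception.

    `usage` maps a file title to the pages embedding it, e.g.
    {"File:Example A.jpg": ["page one", "page two"]} (hypothetical data).
    """
    result = {}
    for file, allowed in entries.items():
        blocked = sorted(set(usage.get(file, ())) - allowed)
        if blocked:
            result[file] = blocked
    return result
```

Titles are compared as exact strings here, so a real run would first need to normalise underscores, namespace aliases and redirects.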

Template:Latn-def requires lowercase letter[edit]

{{Cyrl-def|ru|name|А}} and {{Latn-def|en|name|A}} result in “The name of the Cyrillic script letter А.” and “The name of the Latin-script letter A/[[{{{4}}}#English|{{{4}}}]].”, respectively, because {{Latn-def}} requires specifying the lowercase letter, whereas {{Cyrl-def}} does not; is this intentional? I noticed this issue in the Russian entry for И. J3133 (talk) 07:37, 14 May 2024 (UTC)Reply

@J3133 It looks like User:Theknightwho did some hacking on the Cyrillic one in 2022, and part of that was changing it to not have a |onecase= flag to signal whether there's a lowercase equivalent, but rather checking the value itself. We should probably do the same for {{Latn-def}}. Benwing2 (talk) 06:48, 15 May 2024 (UTC)Reply
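Checking the value itself, as suggested above, could look roughly like this. This is a Python sketch of the idea only, not the Cyrillic template's actual code; the function name case_pair is hypothetical, and it relies on Python's built-in Unicode case mappings.

```python
def case_pair(letter):
    """Return (uppercase, lowercase) for a cased letter, or (letter, None) if unicase."""
    lower, upper = letter.lower(), letter.upper()
    if lower != letter and len(lower) == len(letter):
        return (letter, lower)  # an uppercase letter was given
    if upper != letter and len(upper) == len(letter):
        return (upper, letter)  # a lowercase letter was given
    # No one-to-one case counterpart, so only the letter itself is shown.
    return (letter, None)
```

With a check like this, the template would only need the letter as given, and the lowercase parameter (or a |onecase= flag) would become unnecessary.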

Exporting Wiktionary's templates to another wiki[edit]

So I have my own local MediaWiki installation and I have a bit of a "dictionary" on it where I store words from my English idiolect and also my conlang. So far, it uses "approximations" of Wiktionary templates, not the full templates themselves, as I once tried to import all of Wiktionary's templates, but ran into issues, as the wiki itself has significant amounts of imported Wikipedia templates, and thus name conflicts arose.

Is there a page or guide somewhere about setting up your own Wiktionary? A diehard editor (talk) 02:40, 15 May 2024 (UTC)Reply

@A diehard editor Almost certainly not. I've never heard of anyone doing this. Benwing2 (talk) 06:43, 15 May 2024 (UTC)Reply
Well, I'm out of luck.
Anyways...
The only "Wiktionary for conlangs"-type site I've encountered looks to be Linguifex's Contionary, but they too take the same approach of approximating Wiktionary templates with markup - see Attian ethnema ("language"). Another entry even had a manually coded inflection table.
As for Wiktionary - adding support for another conlang should hopefully be easy. I know a little bit of lua, and when it comes to conlangs, Wiktionary's done it for Esperanto so it stands to reason I can do the same for my own conlang on my local instance.
I'm thinking I could possibly set up another MediaWiki instance and have the dictionary stuff over there. Maybe even make a "MediaWiki Conlang Starter Kit" that only has English in it and leave it up to the user to write their own Lua. Who knows? A diehard editor (talk) 07:55, 15 May 2024 (UTC)Reply
Now that I think about it: the primary source of the difficulty is the immense amount of baggage Wiktionary templates have by having to support hundreds or thousands of natlangs whereas the typical conlanger would only need English support to start with and then a blank canvas to write Lua modules for their conlang. Perhaps conlangers are better off just writing the needed modules from scratch? A diehard editor (talk) 08:19, 15 May 2024 (UTC)Reply
With no disrespect to your project intended, I hope you understand that we already-stretched technovolunteers may not be able to spare the time to help you set up a hobby wiki on your own machine.
Having said that, new Wiktionaries do start up from time to time; just this year the Karakalpak Wiktionary started. I wonder if these communities would find it useful if we did have a guidance page (either here or on Meta-Wiki) on how to set up a basic infrastructure of templates. Otherwise they are expending what are probably limited technovolunteer resources to reinvent the wheel. This, that and the other (talk) 10:26, 15 May 2024 (UTC)Reply
Do not worry, I do not expect any labor. I was only curious to see if there were already resources on how to do this. When I first installed MediaWiki back in February 2022, I did this with the knowledge that it'd be just me, the docs, and reverse-engineering Wikipedia templates that got me through.
I will forge ahead with my project by myself; implementing templates as needed. A diehard editor (talk) 15:21, 15 May 2024 (UTC)Reply