Module talk:utilities

Sorting Korean hanja terms

Can someone please take a look at 公言#Korean or any other Korean hanja term? They are added to Category:Sort key tracking/needed for some reason.

We (Haplology and I) agreed that Korean words sort correctly (i.e. by their jamo, the components of hangeul characters) when they are written in the native hangeul script, so they don't need the hidx= parameter. Hanja (Chinese character) entries, however, need to be sorted by hangeul, e.g. 公言 should be sorted by 공언. We are currently adding these sort keys where they are missing.

There's a current discussion at Module talk:ko-headword. --Anatoli (обсудить/вклад) 05:20, 13 January 2014 (UTC)

Category:Sort key tracking/needed is one of CodeCat's categories; they're annoying, but I think it's best to just try to ignore them. —RuakhTALK 06:43, 13 January 2014 (UTC)
Thanks. Some of her categories are useful, but I'm not sure about this case. Anyway, we need to establish whether these terms are going to be sorted as expected in this case. There seems to be a database delay currently, but the term 合成 should appear (sorted) under ㅎ (h), the first jamo character in 합성 (hapseong). 合成 currently shows separately, at the beginning of the list. --Anatoli (обсудить/вклад) 07:05, 13 January 2014 (UTC)
I have no idea what happened, but I copy-pasted the hangeul in the def line up into the headword and replaced what looks to my eyes to be identical hangeul (‎합성 > 합성), and it sorts correctly now. It's 3 bytes(?) smaller now, somehow. Weird... In an effort to fix the sorting problem I edited ko-headword and redefined the hanja pattern as "not hanja or a space", as opposed to simply the range of Han characters, but it should work for Korean. It's easy to change that if it's wrong.
Whatever ‎합성 is, it should work too. This proves that there is a larger problem with something, what I don't know. Haplogy () 12:11, 13 January 2014 (UTC)Reply
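The three-byte difference reported above is consistent with an invisible U+200E LEFT-TO-RIGHT MARK sitting in front of the hangeul, since that character takes exactly three bytes in UTF-8. The following is a minimal sketch under that assumption (the thread itself does not identify the stray character); the helper name strip_lrm is hypothetical.

```lua
-- U+200E LEFT-TO-RIGHT MARK, written as its three UTF-8 bytes (E2 80 8E).
local LRM = "\226\128\142"

local clean = "합성"
local with_mark = LRM .. clean

-- The two strings render identically but differ in byte length.
local byte_difference = #with_mark - #clean

-- Hypothetical helper: remove every occurrence of the mark from a string.
local function strip_lrm(s)
	return (s:gsub(LRM, ""))
end

local repaired = strip_lrm(with_mark)
```

A sort key that starts with such a mark would no longer begin with a hangeul syllable, which may explain why the entry sorted at the start of the list.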
The original purpose of the category was to be able to remove sort keys when they're not needed anymore, but also to be able to find languages that really need sort key rules but don't have them yet (or have inadequate ones). We also have categories like that for transliterations (which I didn't create). —CodeCat 14:51, 13 January 2014 (UTC)
If the category is still important, please tell us what, if anything, needs to be done. It seems that the expected sorting of Korean hanja entries is working now. If the category is no longer used, could you take it out, please?
Details, in case I didn't explain clearly:
  • Entry 合成, sorted by hangeul (parameter hangeul=) 합성 (hapseong), appears listed under ㅎ (h), the first jamo of the hangeul 합.
  • If a hanja entry is sorted by the first Chinese character (hanja) instead (合 in this case), then either sorting is not working or the "hangeul" value is not provided. All instances of, e.g., "hidx=ㅎ" or "hidx=합성" are to be replaced with "hangeul=합성". --Anatoli (обсудить/вклад) 01:04, 14 January 2014 (UTC)
I see the "problem", I think. The template is using the hangeul spelling as the sort key, which isn't what the module generates by default, so it says "this sort key isn't something I can make on my own, so it's needed". I made a change to Module:utilities so that it ignores the check for Korean altogether. —CodeCat 02:16, 14 January 2014 (UTC)
Thank you! --Anatoli (обсудить/вклад) 02:28, 14 January 2014 (UTC)
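The check described in this thread can be sketched roughly as follows. This is an illustration only, not the code in Module:utilities: the function names, the "not needed" label and the skip table are all hypothetical, and the default sort key generator is reduced to a stub that returns the page name unchanged.

```lua
-- Hypothetical stand-in for the module's default sort key generation;
-- here it simply returns the page name unchanged.
local function default_sort_key(pagename, lang)
	return pagename
end

-- Languages for which the check is skipped (assumption: Korean only).
local skip_check = { ko = true }

-- Returns "needed" when the supplied key differs from what the module would
-- generate on its own, "not needed" when it matches, and nil when nothing
-- is tracked (no key supplied, or the language is skipped).
local function classify_sort_key(pagename, lang, supplied_key)
	if supplied_key == nil or skip_check[lang] then
		return nil
	end
	if supplied_key == default_sort_key(pagename, lang) then
		return "not needed"
	end
	return "needed"
end
```

Under this sketch, a hanja entry with a hangeul sort key is reported as "needed" in any language that is not skipped, which matches the behaviour observed before the Korean exemption was added.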

Lookup table

I think this short function would be useful if added to this module:

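-- Build a set-like table from a string: each run of letters becomes a key.
function export.make_lookup_table(str)
	local ret = {}
	-- "%a+" matches runs of letters, so any non-letter acts as a separator.
	for word in mw.ustring.gmatch(str, "%a+") do
		ret[word] = true
	end
	return ret
end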

This way the giant 'if' that checks for removed languages would be a lot easier to maintain. I use this function in Module:pl-noun. --Tweenk (talk) 06:40, 14 April 2015 (UTC)
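A short usage sketch of the proposed function, written so that it runs outside Scribunto: string.gmatch stands in for mw.ustring.gmatch (adequate for ASCII-only input), and the codes "xxa xxb xxc" and the helper is_removed are placeholders, not the actual list of removed languages.

```lua
-- Stand-in for the proposed function; string.gmatch replaces
-- mw.ustring.gmatch so the sketch runs in plain Lua.
local function make_lookup_table(str)
	local ret = {}
	for word in string.gmatch(str, "%a+") do
		ret[word] = true
	end
	return ret
end

-- Placeholder codes, for illustration only.
local removed = make_lookup_table("xxa xxb xxc")

-- One table lookup replaces a long chain of `code == "..." or ...` tests.
local function is_removed(code)
	return removed[code] == true
end
```

Note that because "%a+" matches letters only, a code containing a hyphen would be split into two separate keys; a different pattern would be needed if such codes must be supported.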

Testing format_categories

@Erutuon, JohnC5 There have been a few occasions where I wanted to test that categories get generated properly (in another module calling format_categories). However, categories don't get formatted in the Module namespace. Can you think of a good way to test this? Passing through force_output just for the test isn't very elegant. Maybe change format_categories so that it works in a test context? – Jberkel 12:35, 17 March 2018 (UTC)

One hackish idea is to override mw.title.getCurrentTitle in the testcases module so that when it's called inside of format_categories, it returns a title with a different namespace. Would just have to make sure that this won't cause problems anywhere else. — Eru·tuon 06:31, 12 April 2019 (UTC)
@Jberkel: Okay, see Module:utilities/testcases. The basic framework seems to work. I verified with mw.log(debug.traceback()) inside the usurping specialGetCurrentTitle function that it's only being called inside of format_categories. — Eru·tuon 06:50, 12 April 2019 (UTC)
@Erutuon: Thanks! However, that was a year ago, now I forgot what I was trying to test :) – Jberkel 07:37, 12 April 2019 (UTC)
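The override idea discussed in this thread can be sketched as below. This is not the code in Module:utilities/testcases: the mw table is a tiny local stub so the example runs in plain Lua, format_categories is a simplified stand-in that only checks for mainspace, and with_fake_title is a hypothetical helper name.

```lua
-- Minimal stand-in for the Scribunto `mw` library (assumption: only the one
-- call used here is stubbed; the real library is far larger).
local mw = {
	title = {
		getCurrentTitle = function()
			return { nsText = "Module", text = "utilities/testcases" }
		end,
	},
}

-- Simplified stand-in for format_categories: emits nothing outside mainspace.
local function format_categories(categories)
	local title = mw.title.getCurrentTitle()
	if title.nsText ~= "" then
		return ""
	end
	local out = {}
	for _, cat in ipairs(categories) do
		out[#out + 1] = "[[Category:" .. cat .. "]]"
	end
	return table.concat(out)
end

-- Test helper: swap in a fake title, run the callback, then always restore
-- the original function, even if the callback raises an error.
local function with_fake_title(fake, fn)
	local original = mw.title.getCurrentTitle
	mw.title.getCurrentTitle = function() return fake end
	local ok, result = pcall(fn)
	mw.title.getCurrentTitle = original
	if not ok then error(result, 0) end
	return result
end

local before = format_categories({ "English nouns" })
local during = with_fake_title({ nsText = "", text = "test" }, function()
	return format_categories({ "English nouns" })
end)
local after = format_categories({ "English nouns" })
```

Restoring the original function inside the helper keeps the override scoped to the one call under test, which addresses the concern about causing problems elsewhere.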

Protection

Module:utilities/data should have the same protection level as Module:utilities. — Eru·tuon 17:57, 5 May 2019 (UTC)

Done! —Rua (mew) 18:49, 5 May 2019 (UTC)

Tagalog Baybayin text

I'm not sure if this is the right discussion page, but on Category:Tagalog terms in Baybayin script I can't see the Baybayin text, even though I already have the font mentioned at Appendix:Baybayin_script. I can see Baybayin text elsewhere, though. I thought adding Tagalog (code="tl", sc="Tglg") to this data might help. Ysrael214 (talk) 13:30, 16 October 2022 (UTC)

no_track?

The variable no_track on line 242 causes trouble.

It was likely meant to be args.nocat, but it's not even declared local. Dpleibovitz (talk) 02:32, 1 November 2023 (UTC)
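To illustrate the kind of problem being reported: in plain Lua, a name that is never declared or assigned is read as a global whose value is nil, so a test on it always takes the same branch. The sketch below is hypothetical and does not reproduce line 242; both function names are invented, and the fix assumes the flag was meant to come from the arguments table, as suggested above.

```lua
-- Hypothetical function: `no_track` is never declared or assigned, so plain
-- Lua reads it as a global variable whose value is nil.
local function should_track_buggy(args)
	return not no_track
end

-- Possible fix, assuming the flag was meant to come from the arguments table.
local function should_track_fixed(args)
	return not args.nocat
end
```

In an environment that enforces strict global checking, the first function would raise an error instead of silently returning true.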

Line 142 breaking

This line seems to be breaking things. I'm getting the following error when I'm looking for definitions: Lua error in Module:utilities at line 142: attempt to perform arithmetic on local 'h' (a nil value). Latin declensions also aren't showing anymore. ~RH9 03:02, 8 January 2024 (UTC)
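This class of error arises whenever arithmetic is attempted on a local variable that holds nil. The sketch below shows the failure and one possible guard; it does not reproduce line 142, and the function names and the fallback value are hypothetical. Whether a fallback is the right fix depends on why the value is nil in the first place.

```lua
-- Hypothetical helper: `h` stands in for whatever value line 142 computes with.
local function next_level_unsafe(h)
	return h + 1 -- raises an error when h is nil
end

-- Guarded variant: fall back to a default instead of raising an error.
local function next_level_safe(h, default)
	if type(h) ~= "number" then
		return default
	end
	return h + 1
end

-- pcall captures the error so the message can be inspected.
local ok, err = pcall(next_level_unsafe, nil)
```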

Criteria for when format_categories returns anything

@Benwing2, Erutuon, This, that and the other, Chuck Entz, JeffDoozan At the moment, format_categories will only return categories if the current page is in mainspace, or otherwise the Appendix, Thesaurus, Citations or Reconstruction namespaces. This is so that we don't get non-content pages being wrongly categorised in content categories. This can be overridden with the force_output parameter, which is mostly used for testing, but it's also used in Module:checkparams (and I'm sure other places too), since maintenance categories need to cover a wider array of pages.

Yesterday, I added Module:maintenance category, which checks whether a page should go in (e.g.) Category:Pages with module errors or Category:Pages with module errors/hidden, and I was thinking it might be a good idea to change the force_output parameter so that the value "maintenance" means it uses those checks instead (but any other values like true or false would cause the same behaviour as now). That way, we can prevent the bad params categories from being flooded with junk, while still making sure they cast a wider net. Theknightwho (talk) 20:47, 10 April 2024 (UTC)

@Theknightwho Sounds good to me. Benwing2 (talk) 22:03, 10 April 2024 (UTC)
@Theknightwho Good with me, too. JeffDoozan (talk) 23:16, 10 April 2024 (UTC)
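The proposal can be sketched as a small decision function. This is an illustration under stated assumptions, not the implementation: should_output and example_check are hypothetical names, namespaces are represented by their text names with "" for mainspace, and example_check is a placeholder that does not reproduce the rules in Module:maintenance category.

```lua
-- Content namespaces named in the discussion ("" is mainspace).
local content_namespaces = {
	[""] = true,
	Appendix = true,
	Thesaurus = true,
	Citations = true,
	Reconstruction = true,
}

-- Decide whether categories should be returned for a page in namespace `ns`.
-- `maintenance_check` is a caller-supplied function standing in for the
-- checks performed by Module:maintenance category.
local function should_output(ns, force_output, maintenance_check)
	if force_output == "maintenance" then
		return maintenance_check(ns)
	elseif force_output then
		return true -- any other truthy value forces output, as now
	end
	return content_namespaces[ns] == true
end

-- Placeholder check, purely illustrative: content namespaces plus Template.
local function example_check(ns)
	return ns == "Template" or content_namespaces[ns] == true
end
```

Testing for the "maintenance" string before the generic truthiness test matters, because any non-empty string is truthy in Lua and would otherwise force output unconditionally.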