Wiktionary talk:Votes/2017-07/Rename categories


Demonyms and ethnonyms

Why not take these to the same conclusion and give them "terms for"-type names? "Terms for people from a place" and "Terms for people of ethnic groups" or similar? —CodeCat 13:48, 24 July 2017 (UTC)

It's OK if people prefer using "Terms for people from a place" and "Terms for people of ethnic groups" or similar, but I'd rather use the shortest normal English-sounding names available. Category:English demonyms and Category:English ethnonyms are fine. They are "lexical categories", like Category:English nouns. --Daniel Carrero (talk) 14:19, 24 July 2017 (UTC)
Should they be, though? It seems inconsistent with how we treat terms for people in general. Compare Category:Occupations. Being a demonym says more about the referent of the word than about the word itself. —CodeCat 14:22, 24 July 2017 (UTC)
Good point, Category:English demonyms and Category:English ethnonyms seem inconsistent with how we treat terms for people in general. But we do have Category:English surnames and Category:Ancient Greek patronymics.
I'd say the simple fact that "-nym" is in the category name serves as a suitable replacement for "terms" or "names" in the category name. Maybe Category:Ancient Greek patronymics should be renamed to Category:Ancient Greek patronyms.
Demonyms are for both natives and inhabitants (according to the entry demonym), so there's this possible name, but it's too long: Category:English terms for natives and inhabitants of places. --Daniel Carrero (talk) 14:34, 24 July 2017 (UTC)
We could add the option to disallow the names Category:English demonyms and Category:English ethnonyms, but I'm not sure we should -- for the reasons above, I think "demonyms" and "ethnonyms" are better than the alternatives presented as of now, and the vote is quite long already. Then again, if conflicting naming systems exist (some different proposals were presented in the BP discussion), it may be natural to create multiple options.
Aside from that, as you know, you may vote "support everything except oppose Category:English demonyms and Category:English ethnonyms" or something if that's consistent with what you think. Or just support everything, or whatever you think is best. --Daniel Carrero (talk) 20:20, 24 July 2017 (UTC)
If we go with ethnonyms and demonyms, should our current Category:Places be renamed to "toponyms" as well? Or should we go with "place names" or "names of places"? —CodeCat 16:19, 6 August 2017 (UTC)
I prefer Category:English place names, because it sounds more like normal, commonly used English to me than Category:English toponyms. It's also shorter than Category:English names of places. --Daniel Carrero (talk) 01:48, 7 August 2017 (UTC)

Where to put the rules

Irrespective of the proposal's merit, I would firmly oppose a vote like this without some sort of confirmation, in the vote text itself, that these rules will be written down in a policy page. This, that and the other (talk) 10:30, 25 July 2017 (UTC)

@This, that and the other: I think you're right. I edited the vote now and added that kind of confirmation, which I think was an improvement. Do you think it looks better now? --Daniel Carrero (talk) 10:47, 25 July 2017 (UTC)

Transition plan

I would like to see some sort of plan for how this large-scale reorganization will actually be implemented, especially for when a category may need to be merged or split. DTLHS (talk) 20:07, 26 July 2017 (UTC)

@DTLHS: What about moving one large set at a time? For example, start by moving the place name categories and forbidding any other category move until all place name categories are done. Then move all the "names of" categories (names of languages, names of individuals, names of stars). Then move all the categories for species of animals. If you like that idea, I can write a more detailed plan like this. --Daniel Carrero (talk) 20:11, 26 July 2017 (UTC)
Is that really necessary? We could handle them in a more ad-hoc per-category manner. —CodeCat 20:33, 26 July 2017 (UTC)
I actually prefer the more ad-hoc per-category manner too. --Daniel Carrero (talk) 20:54, 26 July 2017 (UTC)
We'll know it's done when Category:All topics is empty. —CodeCat 21:11, 26 July 2017 (UTC)
Yes, and when Category:topic cat is empty too. --Daniel Carrero (talk) 21:17, 26 July 2017 (UTC)
Perhaps we should already start working on making new data modules for these categories. —CodeCat 21:19, 26 July 2017 (UTC)
That looks like a good idea. As you know, we can't say for sure whether the vote will pass or fail yet, but maybe some people would be more inclined to vote support if they see some work being done, I guess. It helps to let us know what the modules would look like if this vote passes. --Daniel Carrero (talk) 21:29, 26 July 2017 (UTC)
I've created Module:User:CodeCat/place names. It contains most of the content of Module:category tree/topic cat/data/Places, but not all of it. I've also done a bunch of work to make descriptions consistent and such. —CodeCat 17:58, 6 August 2017 (UTC)
Thank you, that's nice. But can't we compress the repeated information in a predictable way so that it occupies less space?
If possible, I'd like to avoid having repeated descriptions and categorization for lots of regions of the same type, especially those in the same countries.
Some options:
  1. Allow any combination of place name categories if they seem valid, even if they don't make sense in real life. Apparently Canada has provinces and territories, not states, but under this option Category:English names of states of Canada would be recognized as a valid category. Obviously, all categories for actual, real life place names would be recognized automatically too, and ideally anything else shouldn't be created, and may be deleted manually if needed.
  2. We could have a data module with the allowed types of place names for each country as per Wiktionary:Place names (some countries have provinces, prefectures, departments, etc.) and stop at that without checking the actual names of each country subdivision. So if there aren't any "prefectures" in the USA, then this would be invalid: Category:English names of prefectures of Alabama, USA. But the USA does have counties, so this category with a nonsensical state name would be valid: Category:English names of counties of Mxyzptlk, USA. We could use this type of list to specify that "counties" is a "level 2" subdivision in the USA, so we couldn't have categories with an excess of commas like Category:English names of counties of A, B, C, D, E, USA even if just Category:English names of counties of A, USA is allowed. Obviously, we can have the simple Category:English names of counties of USA (without the state) because it could have all these subcategories, and 0 entries.
  3. We could have a large data module listing all the actual place names, with the description and categorization added automatically, not repeated in each instance. Say, Category:English names of municipalities in Minas Gerais, Brazil is a valid place name, these municipalities exist in real life, and the places could be checked in the data module. Maybe we could use Wikidata for that somehow (instead of a data module), but I'm not sure if we could search for "Minas Gerais" (d:Q39109) and "Brazil" (d:Q155) using only their names (not their item IDs) and arrive at the correct data items to check.
--Daniel Carrero (talk) 02:28, 7 August 2017 (UTC)
That's an implementation detail and I'd rather avoid too much technical fanciness for now. I'm also not a fan of the idea of potentially allowing invalid names. —CodeCat 10:52, 7 August 2017 (UTC)
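
For illustration only, option 2 above (a per-country list of allowed subdivision types with a nesting level) could be sketched roughly as follows. This is a Python sketch, not the Lua data module itself; the table contents, the level assigned to "states", and the function name are assumptions made for the example, and only the "counties are level 2 in the USA" and "Canada has provinces and territories" details come from the discussion.

```python
# Hypothetical data: which subdivision types each country allows, and at
# which nesting level. Entries are illustrative assumptions, not real
# Wiktionary module content.
ALLOWED_TYPES = {
    "USA": {"states": 1, "counties": 2},
    "Canada": {"provinces": 1, "territories": 1},
}


def is_valid_category(name: str) -> bool:
    """Check a name like 'English names of counties of A, USA'.

    Only the subdivision type and the number of comma-separated parents
    are checked; the parent names themselves are not verified.
    """
    marker = " names of "
    if marker not in name:
        return False
    _language, rest = name.split(marker, 1)
    if " of " not in rest:
        return False
    place_type, location = rest.split(" of ", 1)
    parts = [p.strip() for p in location.split(",")]
    country = parts[-1]
    level = ALLOWED_TYPES.get(country, {}).get(place_type)
    if level is None:
        return False
    # A level-N subdivision may be preceded by at most N-1 parent names.
    return len(parts) - 1 <= level - 1
```

Under these assumptions, "English names of counties of Mxyzptlk, USA" is accepted (the state name is not checked), while "English names of prefectures of Alabama, USA" and "English names of counties of A, B, C, D, E, USA" are rejected, which matches the behaviour described for option 2.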

"by language"

If this vote passes, it might be a good idea to remove "by language" from all lexical categories. The "by language" was added in Wiktionary:Votes/2011-04/Lexical categories to distinguish, say, Category:Nouns (a category for words like common noun and mass noun) from Category:Nouns by language (a category for nouns like dog and book).

--Daniel Carrero (talk) 21:41, 26 July 2017 (UTC)

Maybe, but I'd like to see a full list of any collisions that would happen. —CodeCat 22:00, 26 July 2017 (UTC)
Correct me if I'm mistaken, but if this vote passes and in addition "by language" is removed from all categories, no collisions should happen at all. Examples:
--Daniel Carrero (talk) 22:51, 26 July 2017 (UTC)
Reality is always different from what you expect. —CodeCat 22:52, 26 July 2017 (UTC)
Alright, but can you tell me one example of collision that might happen? --Daniel Carrero (talk) 22:53, 26 July 2017 (UTC)
The following categories would need to be dealt with:

DTLHS (talk) 23:14, 26 July 2017 (UTC)

Thank you, DTLHS. I rest my case. :P —CodeCat 23:16, 26 July 2017 (UTC)
Thank you for the list. I retract my statement: "no collisions should happen at all". --Daniel Carrero (talk) 23:17, 26 July 2017 (UTC)
I've been thinking... Because of these collisions, maybe we should add "by language" to all the categories when applicable? It would also make clear the exact purpose of the "by language" categories. Obviously, it would work like this:
--Daniel Carrero (talk) 16:44, 27 July 2017 (UTC)
Probably better, yes. The names without it are cleaner, but also more troublesome. —CodeCat 16:55, 27 July 2017 (UTC)
I added the "by language" in the vote text. --Daniel Carrero (talk) 16:59, 27 July 2017 (UTC)
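
As a rough illustration of the kind of collision check asked for in this thread, the following Python sketch takes a list of category names and reports every "X by language" category whose shortened name "X" already exists. The function name and the sample input are hypothetical; the Category:Nouns / Category:Nouns by language pair is the one mentioned at the top of this section.

```python
def find_collisions(categories):
    """Return (old_name, new_name) pairs where dropping ' by language'
    from old_name would produce a name that is already in use."""
    suffix = " by language"
    existing = set(categories)
    collisions = []
    for name in sorted(existing):
        if name.endswith(suffix):
            stripped = name[: -len(suffix)]
            if stripped in existing:
                collisions.append((name, stripped))
    return collisions


# Hypothetical sample input, not an actual category dump.
sample = ["Nouns", "Nouns by language", "Verbs by language"]
print(find_collisions(sample))
```

A real check would need the full list of existing category titles as input; this sketch only shows the comparison itself.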

Parallel proposal: place topical categories next to the senses they apply to

Right now we place them at the bottom, far away from where they are relevant. I think it would be a good idea if we placed them next to the sense that they applied to. This would have no effect on how entries look, of course, but it would help editors and encourage them to add the appropriate categories for specific senses. It would perhaps also discourage them from adding inappropriate context labels. —CodeCat 10:49, 2 August 2017 (UTC)

I don't see why not. This could be a good idea. --Daniel Carrero (talk) 15:03, 3 August 2017 (UTC)
How do you think we could go about implementing it? —CodeCat 15:07, 3 August 2017 (UTC)
Maybe we could have some labels that are invisible by default. For example, if the user types {{lb|en|Beech family plant}}, it would categorize entries in Category:en:Beech family plants (old name) or Category:English terms for Beech family plants (proposed name) but without generating a label. --Daniel Carrero (talk) 20:25, 4 August 2017 (UTC)
What if someone puts in {{lb|en|Star}}? Without qualifying, we'd be back to the old problem. If we just put in the category directly (using {{topic}} or what would be its successor, {{cln}}), then we can decouple labels from categories. Perhaps we can even eventually remove the categorisation of some labels entirely. —CodeCat 20:42, 4 August 2017 (UTC)
==English==

===Noun===
{{en-noun}}

# sense 1
{{cln|some category}}
# sense 2
{{cln|some other category}}

==English==

===Noun===
{{en-noun}}

# {{cln|some category}} sense 1
# {{cln|some other category}} sense 2
Alright, maybe we could place {{cln}} below the sense line. I typed a simple example in this right-floating box. Is that consistent with how you envisioned it? --Daniel Carrero (talk) 21:07, 4 August 2017 (UTC)
The lines would start with #:. DTLHS (talk) 21:08, 4 August 2017 (UTC)
Why not place them before the sense, though? We already place context labels there, as well as senseids. —CodeCat 21:11, 4 August 2017 (UTC)
OK, I added a second example below the first. This should be what you described. If it is, it looks good to me too. --Daniel Carrero (talk) 21:21, 4 August 2017 (UTC)
Yes, this is what I had in mind. As for relative ordering: senseid first, then semantic categories, then context labels. —CodeCat 21:25, 4 August 2017 (UTC)
Why not make them visible? The template could work the same way as {{syn}}. DTLHS (talk) 18:06, 3 August 2017 (UTC)
@DTLHS: I would rather not let them be visible like {{syn}}. I fear these categories that are not context labels (say, Category:en:Trees) might distract readers from reading the definitions. --Daniel Carrero (talk) 05:20, 7 August 2017 (UTC)
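
The relative ordering suggested in this thread (senseid first, then semantic categories, then context labels, then the definition) can be illustrated with a small Python helper that assembles a sense line. This is only a sketch: the helper name is made up, {{cln|...}} is used in the same placeholder form as in the examples above, and the exact parameters given to {{senseid}} and {{lb}} here are assumptions.

```python
def build_sense_line(definition, senseid=None, categories=(), labels=(), lang="en"):
    """Assemble one wikitext sense line in the order:
    senseid, semantic categories, context labels, definition."""
    parts = []
    if senseid:
        parts.append("{{senseid|%s|%s}}" % (lang, senseid))
    for cat in categories:
        # Placeholder usage, mirroring the {{cln|some category}} examples.
        parts.append("{{cln|%s}}" % cat)
    if labels:
        parts.append("{{lb|%s|%s}}" % (lang, "|".join(labels)))
    parts.append(definition)
    return "# " + " ".join(parts)


print(build_sense_line("sense 1", categories=["some category"]))
```

Called with only a category and a definition, the helper reproduces the form of the second example above, with the category template placed before the sense text.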

Relative categorization

The vote doesn't currently say anything about how these new categories would be categorized relative to one another. I don't know if the vote needs to say this, but it would be good to work out in advance.

In principle, it would make sense for "terms for" categories to have "terms related to" categories as their parents. However, this runs the risk that many of the "related to" categories will stay empty and only act as meaningless parent categories. An example that came to mind is that of Category:Reindeers. This is currently relatively empty but at least it could be potentially filled up if someone was willing to take up the task. Languages such as Northern Sami have a lot of reindeer-related vocabulary as well, so there would also be a "terms related to reindeer" category for them. The problem is when we put Category:Northern Sami terms for reindeer species (I'm assuming that species categories will be named this way, the vote is actually missing them) into Category:Northern Sami terms related to reindeer. This categorization makes sense for Northern Sami, but obviously not for Arabic which presumably has zero reindeer-related vocabulary outside of species names. In fact, the vast majority of languages around the world have no reindeer-related terms, only species names. So their "terms related to reindeer" categories will all be empty and exist only for the sake of the few languages like Northern Sami that have that kind of vocabulary. —CodeCat 18:02, 3 August 2017 (UTC)

I'll type that category tree here:
I agree that in principle, it would make sense for "terms for" categories to have "terms related to" categories as their parents.
I can't speak for everyone, but personally I'm completely fine with having that empty or almost empty Arabic category ("Arabic terms related to reindeer") as long as the Northern Sami version of the same category has a lot of reindeer-related terms. I assume it would help navigation, by having the same tree for semantical categories (I'm thinking, maybe I'll stop saying "topical categories") of all languages. Don't we already have a lot of empty categories like these which serve the purpose of containing other categories to match the same tree in other languages?
I noticed that all current "reindeers" categories have only one entry per language. OK, I'm sure we could have a lot of reindeer-related terms in Northern Sami as you said. I would suggest deleting this whole category tree if it were to contain just one entry per category forever.
--Daniel Carrero (talk) 20:57, 4 August 2017 (UTC)
I'm not a fan of creating useless categories. Perhaps we should introduce a new concept, that of "related" category. Such categories would link to each other directly via something in their description, rather than via subcategorisation. That way the categories can exist, or not exist, independently of each other. —CodeCat 21:06, 4 August 2017 (UTC)
A bit off-topic: We've been discussing "terms for reindeer species", but it seems "terms for reindeer subspecies" would be accurate. Apparently reindeer is a single species itself and it has subspecies.
Does that idea of "related" categories mean that even if both Category:English terms relating to reindeer and Category:English terms for reindeer subspecies exist, one can't be a subcategory of the other because they would be linked through their descriptions? --Daniel Carrero (talk) 21:19, 4 August 2017 (UTC)
Ugh. Do we really have to start remembering whether things are genus, family, order, phylum, species, subspecies and so on? I'm going to quit if that's the case. It's the most pointless part of taxonomy that I know of. I'm sure others will find it needless busywork as well.
The point of related categories is that the relationship isn't one of subcategorisation. If you want to subcategorise, then you don't need related categories. You know how currently, POS-type categories show a list of recognised subcategories with a description? Related categories would be like that, but listed separately (above or below). A category would not appear in both lists, that would be silly. —CodeCat 21:24, 4 August 2017 (UTC)
I don't think we need to have words like genus, family, order, phylum, species, subspecies in category names, but I noticed that Category:English terms for reindeer species seems to include an inaccurate use of the word "species". I would be happy with an equivalent like Category:English terms for kinds of reindeer.
Aren't the reindeer categories a relationship of subcategorization? As I believe you implied, all terms for kinds of reindeer (say, Finnish forest reindeer) are terms relating to reindeer. Or maybe you are thinking of using the "related" categories for other categories? --Daniel Carrero (talk) 21:41, 4 August 2017 (UTC)
Semantically you are correct. All terms for kinds of reindeer are also terms related to reindeer. But our categories don't have to reflect the semantic relationships exactly. We can do something different, such as related categories, for pragmatic reasons. There are other cases where this applies as well, I have come across many but can't think of any examples offhand. But if you look at the list of categories for an ancient language like Proto-Germanic, you might see categories that are horribly out of place for such a language, relating to concepts that the speakers of that language had no notion of. I prefer to categorise in such a way that it reflects, within reason, the understanding of its speakers.
"terms for kinds of reindeer" could possibly also contain kinds of reindeer that are not genetic variants. Think of possible words like "breeding reindeer, infertile reindeer, female reindeer", etc. —CodeCat 21:46, 4 August 2017 (UTC)
If we decide that Category:Arabic terms for reindeer species should exist and Category:Arabic terms related to reindeer shouldn't, then how do we get to the existing category in the first place? Won't there be a number of semantical categories that are not in any category?
I guess Category:English terms for reindeer subspecies is the best name among those discussed for that category. To some extent, it's similar to Category:English terms for dog breeds ("breeds" instead of "subspecies"). We could still rename Category:en:Beech family plants to Category:English terms for Beech family plants; that is, the use of "species", "subspecies" could be allowed sometimes when needed for disambiguation, this time to avoid possible words like "breeding reindeer, infertile reindeer, female reindeer", etc. --Daniel Carrero (talk) 22:42, 4 August 2017 (UTC)
The categories would still form a tree, and have at least one parent. Category:en:Reindeers currently has Category:en:Cervids as its parent, which is fine because all reindeer are cervids. The new structure should preserve this subtyping. The new "terms related to reindeer" meanwhile could be placed under something more generic such as "terms related to cervids", "terms related to mammals", or further up the tree. You'd end up with two parallel trees, with corresponding branches linked via the "related" mechanism. —CodeCat 22:56, 4 August 2017 (UTC)
I'm not sure. I believe you mean this category tree below, but this does not seem very intuitive to me. Let me know if this is not consistent with what you proposed.
I think Category:Arabic terms for reindeer species really should be a subcategory of Category:Arabic terms related to reindeer. If an anon or new editor wants to create a category, this should be the most intuitive thing to do. If these two category trees really are kept separate, arguably Finnish forest reindeer should be part of both "reindeer" categories (which I'm not suggesting to do).
It does not even solve that problem: categories like Category:Arabic terms related to cervids could be completely or almost empty, and serve only to contain subcategories like Category:Arabic terms related to reindeer. --Daniel Carrero (talk) 23:54, 4 August 2017 (UTC)

"relating to" / "related to"

Does either of these two options sound more like normal native English? I'm not a native speaker, so I can't always tell. (Both sound equally good to me, and "terms relating to" and "terms related to" have a similar number of Google hits.)

--Daniel Carrero (talk) 20:35, 4 August 2017 (UTC)

of / in

In Module:User:CodeCat/place names, I noticed that some place names use "of" and others use "in". Please check if they are all OK. If they are, the vote may need to be edited to say that all these possibilities are accepted. If there are any problems, they should be fixed. I believe this is the full list.

"names of ... of"
  • "names of autonomous communities of"
  • "names of autonomous oblasts of"
  • "names of autonomous okrugs of"
  • "names of autonomous regions of"
  • "names of autonomous republics of"
  • "names of cantons of"
  • "names of census-designated places of"
  • "names of counties of"
  • "names of countries of"
  • "names of departments of"
  • "names of districts of"
  • "names of districts and autonomous regions of"
  • "names of divisions of"
  • "names of federal cities of"
  • "names of krais of"
  • "names of oblasts of"
  • "names of prefectures of"
  • "names of provinces of"
  • "names of regencies of"
  • "names of regions of"
  • "names of republics of"
  • "names of special administrative regions of"
  • "names of state capitals of"
  • "names of states of"
  • "names of subdistricts of"
  • "names of subprefectures of"
  • "names of territories of"
"names of ... in"
  • "names of boroughs in"
  • "names of cities in"
  • "names of municipalities in"
  • "names of rivers in"
  • "names of special wards in"
  • "names of towns in"
  • "names of villages in"

Compare this, which seems to be inconsistent: "names of federal cities of" / "names of cities in". --Daniel Carrero (talk) 11:57, 12 August 2017 (UTC)

I would say that "in" refers to geographical location, while "of" refers to administrative division. Another way to think of it might be that with "in", you're saying that the thing exists even if the political structure around it were to disappear, while with "of" it would not be the case. —CodeCat 12:02, 12 August 2017 (UTC)
Then, in the list above as currently shown, it seems only "names of rivers in" should keep the "in" at the end. All other names would have "of", including: "names of boroughs of", "names of cities of", "names of villages of", etc. We could also add (if needed) these other "in" places: "names of mountains in", "names of lakes in", "names of volcanoes in". Does that sound OK to you? --Daniel Carrero (talk) 12:07, 12 August 2017 (UTC)
What do you mean? Cities don't vanish when the country does. They are physical entities, not abstract. Municipalities could become "of" though, same with special wards. —CodeCat 12:44, 12 August 2017 (UTC)
@CodeCat: Fair enough. I edited the vote to allow the use of both "in" and "of". --Daniel Carrero (talk) 14:51, 16 August 2017 (UTC)
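
To make the rule of thumb from this thread concrete ("in" for things that exist independently of the political structure, "of" for administrative divisions), here is a small Python sketch. The set of "physical" place types and the helper name are assumptions for illustration: rivers, cities, mountains, lakes and volcanoes are mentioned in the discussion, while towns and villages are included only by analogy with cities. It does not reflect what Module:User:CodeCat/place names actually does.

```python
# Place types treated as physical or geographical (assumed list).
PHYSICAL_TYPES = {
    "rivers", "mountains", "lakes", "volcanoes",
    "cities", "towns", "villages",
}


def name_phrase(place_type: str) -> str:
    """Return the category phrase for a place type, choosing 'in' for
    physical places and 'of' for administrative divisions."""
    preposition = "in" if place_type in PHYSICAL_TYPES else "of"
    return "names of %s %s" % (place_type, preposition)


for place_type in ("rivers", "cities", "counties", "municipalities"):
    print(name_phrase(place_type))
```

Since the vote was edited to allow both "in" and "of", a table like this would only record a preferred default for each place type, not a hard rule.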