Wiktionary:Votes/2012-08/Extinct Languages - Criteria for Inclusion

Extinct Languages - Criteria for Inclusion

  • Voting on: Changes to the Criteria for inclusion and accompanying pages to allow words in extinct languages to be included on Wiktionary based on a single mention, provided a citation is given.

Specific change (changes marked in colour):

From

Number of citations

For languages well documented on the Internet, three citations in which a term is used is the minimum number for inclusion in Wiktionary. For terms in extinct languages, one use in a contemporaneous source is the minimum. For all other spoken languages, only one use or mention is adequate, subject to the following requirements:

  • the community of editors for that language should maintain a list of materials deemed appropriate as the sole source for entries based on a single mention,
  • each entry should have its source(s) listed on the entry or citation page, and
  • a box explaining that a low number of citations were used should be included on the entry page (such as by using the {{LDL}} template).[1]

to

Number of citations

For languages well documented on the Internet, three citations in which a term is used is the minimum number for inclusion in Wiktionary. For terms in extinct languages, one use in a contemporaneous source is the minimum, or one mention is adequate subject to the below requirements. For all other spoken languages that are living, only one use or mention is adequate, subject to the following requirements:

  • the community of editors for that language should maintain a list of materials deemed appropriate as the only sources for entries based on a single mention,
  • each entry should have its source(s) listed on the entry or citation page, and
  • a box explaining that a low number of citations were used should be included on the entry page (such as by using the {{LDL}} template).[2]


  • Vote starts: 00:01, 18 August 2012 (UTC)
  • Vote ends: 23:59, 16 September 2012 (UTC)

Support

  1. Support Liliana 14:32, 18 August 2012 (UTC)
  2. Support CodeCat 15:19, 18 August 2012 (UTC)
  3. Support BB12 (talk) 16:22, 18 August 2012 (UTC)
  4. Support --Μετάknowledgediscuss/deeds 03:09, 19 August 2012 (UTC)
  5. Support --Anatoli (обсудить) 22:34, 20 August 2012 (UTC)
  6. Support As a rule, I tend to support policies that conform with my practice, and the Ancient Greek community has long deemed it desirable to occasionally create entries for words which are based solely on mentions. If this were a flat support of "mentions are always ok", then I would find it problematic, but the requirement that the language community spell out how it is to be used allows the infusion of good sense and context into the decisions. -Atelaes λάλει ἐμοί 23:41, 20 August 2012 (UTC)
  7. Support —Stephen (Talk) 01:32, 21 August 2012 (UTC)
  8. Support --Vahag (talk) 09:26, 21 August 2012 (UTC)
  9. Support. — Ungoliant (Falai) 15:39, 26 August 2012 (UTC)

Oppose

  1. Oppose. Specifically oppose allowing mentions instead of uses, see talk page. Mglovesfun (talk) 15:21, 18 August 2012 (UTC)
  2. Oppose Dan Polansky (talk) 23:06, 18 August 2012 (UTC) I don't believe using mentions as evidence of existence of words in actual use is advisable. Furthermore, this vote neither states the proposer's rationale nor provides a link to it. I find a rationale neither on the vote page nor on the vote talk page. The rationale could possibly be found in one of the six linked discussion pages, but this vote does not tell me where to find it. Finally, one language mentioned in relation to attestation via mentions was Dacian. It was mentioned in Wiktionary_talk:Votes/pl-2011-05/Attestation_of_extinct_languages. Judging from W:Dacian language, 3rd paragraph ("... In ancient literary sources, the Dacian names of a number of medicinal plants and herbs survive in ancient texts[7][8] this includes about 60 plant names with Dioscorides.[9] Dacian is also known through about 1,150 proper names[6][10] and about 900 toponyms.[6] ..."), almost nothing is known about Dacian words, so Dacian cannot be significantly included in Wiktionary anyway. --Dan Polansky (talk) 23:06, 18 August 2012 (UTC)
    Allowing the inclusion of Dacian is a critical issue. Without doing so, the primary goal of Wiktionary to have all words in all languages cannot be reached. Other reasons can be given, such as those found on the links, but I do not believe they will sway anyone who is opposed from the get-go :) --BB12 (talk) 23:42, 18 August 2012 (UTC)
    We cannot have all words in all languages. We can only have those words for whose existence there is sufficient evidence. Above all, we cannot have words that have existed but for which there is no evidence whatsoever, not even in the form of mentions, as is the case with the overwhelming majority of Dacian words. You seem to think that mentions are good enough evidence; I don't. And yes, I love the card of "other reasons can be given, they can be found if someone bothers to dig them up, but I the proposer could not care less to do the digging myself". --Dan Polansky (talk) 09:25, 19 August 2012 (UTC)
    Alright, Dan. Propose a vote that we remove "all words in all languages" from CFI, and see what happens. In the meantime, it's really disingenuous and rude to grossly modify what somebody has said within quote marks to their face, and I think you know that whatever we pull up to convince you, you will remain stubborn and engage us in the usual few paragraphs of ideological opposition. --Μετάknowledgediscuss/deeds 16:11, 19 August 2012 (UTC)
    I repeat: Above all, we cannot have words that have existed but for which there is no evidence whatsoever, not even in the form of mentions, as is the case with the overwhelming majority of Dacian words. --Dan Polansky (talk) 17:28, 19 August 2012 (UTC)
    If I may, I think Dan's point is that "all words in all languages" would include totally unattested ones. It's a silly argument based on the statement above "Allowing the inclusion of Dacian is a critical issue. Without doing so, the primary goal of Wiktionary to have all words in all languages cannot be reached". This can probably never be reached under any circumstances. Someone did a rough calculation and reckoned we might need as many as six billion words. Even if this does happen, it'll be long after we've all died. Mglovesfun (talk) 17:36, 19 August 2012 (UTC)
    Just because a dream seems impossible doesn't mean we shouldn't try. It's not all or nothing. There are probably legitimate Dacian words lost to time, but if there are some that aren't, that we have knowledge of yet refuse to allow the inclusion of — that's not fulfilling the purpose of Wiktionary. Who made that estimate? It seems way off to me, considering that most languages don't have very large vocabularies, but I'd like to see the math that went into it. --Μετάknowledgediscuss/deeds 18:05, 19 August 2012 (UTC)
    I feel obligated to respond, but all I can do is say MK has said it well in both responses above. --BB12 (talk) 18:17, 19 August 2012 (UTC)
    But Dan's point still stands, right? Mglovesfun (talk) 19:00, 19 August 2012 (UTC)
    His point makes no sense to me, so I did not respond. Of course nobody is going to make up words and claim they're Dacian. We have safeguards for that, including the fact that the language community can choose to exclude certain sources. --Μετάknowledgediscuss/deeds 19:07, 19 August 2012 (UTC)
    His point makes perfect sense. If I can understand it, you can too. Mglovesfun (talk) 19:16, 19 August 2012 (UTC)
    If there is a text or carving from 2,000 years ago saying word X is a word in Dacian and it means Y, then it is part of our collective record of language use, and I think it should be included in Wiktionary as part of human language. Dan Polansky says, "You seem to think that mentions are good enough evidence; I don't." Holding discussions, declaring opinions and voting on policy is why we have these proposals. Without Dan Polansky's kind feedback here and here and probably other places, my three follow-up proposals, including this one, would have been sorely lacking. I am grateful for his advice. Ultimately, we Wiktionarians disagree in how to judge what constitutes worthy evidence, and I don't think reconciliation is in the offing. We have to simply respect each other's opinions and make Wiktionary as good as we can. :) --BB12 (talk) 19:48, 19 August 2012 (UTC)
    ──────────────────────────────────────────────────────────────────────────────────────────────────── Re: "If there is a text or carving from 2,000 years ago saying word X is a word in Dacian and it means Y, then it is part of our collective record of language use, ...": A text that says "word X is a word in Dacian and it means Y" is not part of the collective record of use of Dacian. That text, even if authentic, may easily be wrong about Dacian, spreading a myth about Dacian. By your reasoning, we might as well include all dictionary-only English words in the mainspace, in order to include "all words in all languages". After all, "dictionary-only words" are words almost by definition, right? And so are words only attested on the web but not in durably archived sources. If you want to include as many words as you can at all costs, you have to relax your inclusion criteria to a maximum possible extent. I would much sooner see Ungoliant and Sauron included as terms that have actually been used and are certainly pronounced by Tolkien fans than mention-only would-be words. By the way, thanks for the links to past discussions: Wiktionary_talk:Votes/2012-04/Languages_with_limited_documentation#Extinct_languages_and_mention. --Dan Polansky (talk) 20:57, 20 August 2012 (UTC)
    Good point. I meant to say that; the thing with mentions where there are no uses is we don't know how reliable they are. We're pinning all our hopes on whoever wrote the mention getting it right, with no way of checking. Mglovesfun (talk) 20:59, 20 August 2012 (UTC)
    I have no interest in discussing Tolkien; that is a different topic.
    Because Dacian is in the written record, it is a part of our collective history of language and is therefore worthy of inclusion. The disclaimer that this proposal requires is intended to warn the user that the entry might be in error. (Note that Latin and Ancient Greek words with only one or two uses are also likely to be erroneous, yet we allow those.) I invited you, Mglovesfun, to discuss an alternative for Dacian on the talk page, but you did not respond, so I let the proposal go to vote. It's too bad that this opposition could not have been raised on the talk page before the proposal went to vote, but now that the vote is active, I think the issues have been adequately discussed, so I will probably not comment further on this thread :) --BB12 (talk) 22:07, 20 August 2012 (UTC)
    I think my position's pretty clear. I don't feel the need to add more. Mglovesfun (talk) 22:09, 20 August 2012 (UTC)
    There is one situation where a mention would definitely be reliable: if it were written by a speaker of that language. None of the many European scribes who wrote in Latin were raised speaking Latin, of course. So if a German scribe inserted a word in Old High German (which he himself spoke) into a Latin text, then that should probably be considered a reliable attestation. We may not always be able to ascertain who wrote a text and what their command of other languages was, but if it's known that the writer of a Latin legal document which contains a word in a local language not attested elsewhere, also wrote a longer text in that same language, then I don't see why that single word should not be considered reliably attested. —CodeCat 22:20, 20 August 2012 (UTC)
  3. Oppose for two reasons. Reason the first: the proposed text implies that the Proto-Indo-European editors will be able to name some experts whose mentions of reconstructed forms count for attestation. I think that's terrible. Reason the second: Μετάknowledge and BB12's shocking responses to Dan Polansky's points suggest that they would actually support this construal of the proposed text, since such an approach is the only way to fulfill our goal of including all words of Proto-Indo-European. —RuakhTALK 19:50, 21 August 2012 (UTC)
    We have policy that states that reconstructed terms are not allowed in the mainspace, and I agree with it. Reconstructed languages are not under discussion here, and I don't think that this proposal contradicts that. If you want us to make that viewpoint (which I'm sure BB shares) explicit, that can be easily done by consensus in the BP. --Μετάknowledgediscuss/deeds 20:15, 21 August 2012 (UTC)
    Reconstructed forms are limited to proto languages. The text doesn't say anything about contemporary mentions either, so a modern citation of "there is a hypothetical Vulgar Latin word montanea" would be perfectly valid to create an entry for montanea, provided the source of the mention was deemed acceptable. Mglovesfun (talk) 20:22, 21 August 2012 (UTC)
    Reconstructed languages are completely unaffected by this vote. See Wiktionary:CFI#Reconstructed_languages, a policy which will remain regardless of the outcome of this vote. --BB12 (talk) 22:35, 21 August 2012 (UTC)
    @MG: Except that it wouldn't happen, because Latin editors like me would notice it and choose not to allow that source for mentions. --Μετάknowledgediscuss/deeds 23:55, 21 August 2012 (UTC)
    • Oppose. I'm not sure about the first green-font addition, but the second makes the CFI say that extinct languages are living, and I can't support that. —msh210 (talk) 05:27, 11 September 2012 (UTC) —msh210 (talk) 15:51, 11 September 2012 (UTC)
      I can't see how it can be interpreted that way. "For all other spoken languages that are living" can be rephrased as "with respect to all other living spoken languages." --BB12 (talk) 07:13, 11 September 2012 (UTC)
      @msh210: If the overall construction were "For terms in extinct languages" vs. "For all other spoken languages that are living", then yes, that would imply that at least some extinct languages are living; but the overall construction is actually "For languages well documented on the Internet" vs. "For terms in extinct languages" vs. "For all other spoken languages that are living". (It's a bit confusing, I admit, and how come only the second one has "terms"?, but I don't think it has to be read as saying what you say it says.) —RuakhTALK 12:06, 11 September 2012 (UTC)
      I shouldn't vote on things when very tired. (More precisely, I shouldn't vote on things and express my reasoning when very tired. If I am being foolish but don't express my reasoning, then, at least, no one will catch me at it.) Striking this vote by indenting, and indenting all the comments beneath it to match. —msh210 (talk) 15:51, 11 September 2012 (UTC)

Abstain

Decision

Passes 9-3-0 (75%). I will modify CFI accordingly. --Μετάknowledgediscuss/deeds 05:16, 17 September 2012 (UTC)
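
For readers checking the arithmetic, the 75% figure matches the support share among support and oppose votes: 9 / (9 + 3). The sketch below is only an illustration of that calculation. The function name `support_share` is hypothetical, and leaving abstentions out of the denominator is an assumption; with zero abstentions here, it does not affect the result.

```python
def support_share(support: int, oppose: int, abstain: int = 0) -> float:
    """Return support as a percentage of support + oppose votes.

    Assumption: abstentions are not counted in the denominator.
    """
    decided = support + oppose
    if decided == 0:
        raise ValueError("no support or oppose votes were cast")
    return 100 * support / decided


# Tally from the decision line: 9 support, 3 oppose, 0 abstain.
print(f"{support_share(9, 3, 0):.0f}%")
```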