Category talk:100 English basic words

From Wiktionary, the free dictionary
Latest comment: 13 years ago by Beobach972 in topic Additional discussion

I use the word "most" much more often than I use the word "shall", but that's not how they're ranked in the Gutenberg texts. I suspect the language is changing enough that "should" is more common, and I think I use "most" even more often than that. Inevitably, there will be conflicts over what the 100 most common words in the English language actually are; I'm not sure that Gutenberg got it completely right.

Galactiger 13:34, 27 August 2006 (UTC)

Recommendations

These categories really need an explanation at the top. I would have had no idea that this was from analyzing Project Gutenberg books if there had not been the previous comment above. Also, why are there 101 words in this cat? --Cromwellt|Talk|Contribs 15:48, 28 September 2006 (UTC)

Oh, and it depends on what the criteria are. If we are referring to the most common in print, that will be different from the most common in conversation, and both will be different from the most common in school books as well. The link at the top of this cat goes somewhere else, and I can't find the correct target. --Cromwellt|Talk|Contribs 16:09, 28 September 2006 (UTC)

Confusing reference

Since Wiktionary seems to go by Project Gutenberg's list, it is rather confusing to have the link to Literary Trust's list featured prominently on this category page without any comment.

Also, the list appears to be too long now (101 words). __meco 10:14, 1 April 2007 (UTC)

Literary Trust's list

Here's an alphabetized version of the LT's list. As you can see, the current contents of this category don't match it very closely...

I a about all an and are as at back be been before big but by call came can come could did do down first for from get go had has have he her here him his if in into is it just like little look made make me more much must my new no not now of off old on one only or other our out over right said see she so some that the their them then there they this to two up want was we well went were what when where which who will with you your

I suggested a feature be added to User:AutoFormat to ensure this and categories like it don't get modified; that'd help... JesseW 09:33, 7 February 2009 (UTC)
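The mismatch between the quoted list and the category can be checked mechanically. The following is only a minimal sketch in Python: `LT_LIST` is the list exactly as quoted above, while `category_words` and `compare_lists` are hypothetical names, and the sample category contents are placeholders for illustration, not the real members of this category.

```python
# The LT list exactly as quoted in this section, split on whitespace.
LT_LIST = (
    "I a about all an and are as at back be been before big but by call "
    "came can come could did do down first for from get go had has have he "
    "her here him his if in into is it just like little look made make me "
    "more much must my new no not now of off old on one only or other our "
    "out over right said see she so some that the their them then there "
    "they this to two up want was we well went were what when where which "
    "who will with you your"
).split()


def compare_lists(reference, candidate):
    """Return (missing, extra): words only in reference, words only in candidate."""
    ref, cand = set(reference), set(candidate)
    return sorted(ref - cand), sorted(cand - ref)


# Hypothetical category contents, for illustration only (an assumption).
category_words = ["I", "a", "the", "shall", "most"]

missing, extra = compare_lists(LT_LIST, category_words)
print(len(LT_LIST), "words in the quoted list")
print(len(missing), "quoted words missing from the sample category")
print("words in the sample category but not in the quoted list:", extra)
```

Replacing `category_words` with the actual category members would show which entries are extra and which are missing, which is the kind of check a bot guarding the category could run.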

Deletion debate

The following information passed a request for deletion.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This is outdated, and there are not 100 words in it! Below is a copy of the discussion, from Wiktionary:Requests for cleanup, more than 2 years ago. I'd strongly favor this information in appendixes.

First of all, there are 101 words in there. Secondly, I often see a word that ranks somewhere over a hundred in Gutenberg but is in this category. Third, there are so many of those lists around that I do not know which one to choose. henne 17:09, 11 January 2007 (UTC)

This was a list of words created and designated by THEM, and is not based on what words are most common. It's a "starter vocabulary", and the equivalents of these words are deemed to be a good starting point for a new Wiktionary project. --EncycloPetey 06:22, 14 January 2007 (UTC)
My analysis of Project Gutenberg (as a corpus) has no relation to this person's project. I find them interesting in comparison to each other, as well as to the other Frequency lists we have.
Perhaps if I actually had compared them in earnest, I would have noticed (before now) that it links to a copyrighted site that has a no-commercial-reuse clause. So this should move from WT:RFC to WT:RFDO.  :-(
--Connel MacKenzie 06:26, 14 January 2007 (UTC)
But I will not move it to RFDO myself, as that would possibly look like I'm favoring Project Gutenberg unfairly, or some-such. --Connel MacKenzie 08:32, 24 January 2007 (UTC)
Delete as it appears the copyright on this list is not compatible with Wiktionary. Could we include it as an appendix if there are copyright problems? --Bequw¢τ 14:51, 12 November 2009 (UTC)
Delete per above. Mglovesfun (talk) 18:14, 12 November 2009 (UTC)
Are we talking about the UK National Literacy Trust list of 100 basic words? Do we think they are out to sue us for breaking copyright? There might be a reason for WMF (or us) to convince them to use one or more of the licenses that would make it automatic for us to keep it. It is safe to assume that they don't hold copyright on the category name. Keep category, find or compile a copyright-acceptable list of basic words. Perhaps the vast technical resources of Simple English Wiktionary can help. Or we could borrow/link from them. DCDuring TALK 16:15, 13 November 2009 (UTC)
It could be replaced with the top 100 Gutenberg words by frequency. We should remove copyvios regardless of what we think the other organization will do. If they do allow it in the future it can always be added back. --Bequw¢τ 17:31, 14 November 2009 (UTC)
The 100 most common words according to Gutenberg are mostly pronouns and prepositions. That list would not be too useful and we already have it as an appendix I believe. -- Prince Kassad 18:06, 14 November 2009 (UTC)
Keep unless this is an indubitable copyvio. --Dan Polansky 21:04, 15 November 2009 (UTC)
Keep per Dan Polansky. Razorflame 20:17, 16 November 2009 (UTC)
Keep & RFC These words were added by the long-gone Conan (and his Bot). The original link on the category page was not to the UK site but to here (a cached copy of the now-dead original page). But oddly, the words in this category don't match up to either published list. So there's no copyright violation going on. It leaves us with the fact that we don't know how these words were chosen, which is why I think they should be sent for cleanup, not deletion. --Bequw¢τ 20:37, 17 November 2009 (UTC)

Kept. Mglovesfun (talk) 13:14, 29 November 2009 (UTC)

Additional discussion

Note this archive of RFC and RFD discussions. — Beobach 02:24, 15 December 2010 (UTC)

RFC discussion: January 2007–December 2010

The following discussion has been moved from Wiktionary:Requests for cleanup (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


First of all, there are 101 words in there. Secondly, I often see a word that ranks somewhere over a hundred in Gutenberg but is in this category. Third, there are so many of those lists around that I do not know which one to choose. henne 17:09, 11 January 2007 (UTC)

This was a list of words created and designated by THEM, and is not based on what words are most common. It's a "starter vocabulary", and the equivalents of these words are deemed to be a good starting point for a new Wiktionary project. --EncycloPetey 06:22, 14 January 2007 (UTC)
My analysis of Project Gutenberg (as a corpus) has no relation to this person's project. I find them interesting in comparison to each other, as well as to the other Frequency lists we have.
Perhaps if I actually had compared them in earnest, I would have noticed (before now) that it links to a copyrighted site that has a no-commercial-reuse clause. So this should move from WT:RFC to WT:RFDO.  :-(
--Connel MacKenzie 06:26, 14 January 2007 (UTC)
But I will not move it to RFDO myself, as that would possibly look like I'm favoring Project Gutenberg unfairly, or some-such. --Connel MacKenzie 08:32, 24 January 2007 (UTC)
Duly copied to RFDO. — Beobach 07:32, 20 November 2010 (UTC)
Reportedly previously discussed on RFDO and kept despite copyright concerns. — Beobach 09:01, 20 November 2010 (UTC)
NB the RFD discussion. — Beobach 02:22, 15 December 2010 (UTC)


RFM discussion: September 2016

See Category talk:Basic words by language#RFM discussion: September 2016.