Oxford University Press drew on corpus research and expert views to compile a word list called the Oxford 3,000. Word lists have been used for a long time to guide and facilitate language learning, so this is nothing new. But with the advent of computer corpora (large, electronically stored samples of language which applied linguists can analyse to find out important information about the language and how it’s typically used in speech and writing), we now have a useful tool to refine our understanding of language and how it works. And this can potentially inform how we teach it. The Oxford 3,000 identifies the most important words to learn due to their frequency, range, and utility.
And these words are included in the Oxford Essential Learner’s Dictionary. Of these 3,000 words, 2,000 are considered key words in English and have a key icon next to them in the dictionary. For example, the following words are in the Oxford 3,000: peach, receive, shirt, and widow. But only receive and shirt are key words. So let’s talk a little bit more about how these words were selected to appear in the Oxford 3,000. As I mentioned, there were three criteria which guided the selection process: frequency, range, and centrality. Frequency is very important, but should not be the only criterion. We know if a word is frequent by looking at corpora.
A corpus, as was mentioned previously, is an electronically held collection of written and spoken text. The British National Corpus contains over 100 million words taken from newspapers, television, and real-life conversations. The Oxford Corpus Collection is another corpus which contains over 2.5 billion words from the World Wide Web, including emails, blogs, and social media. And not all the examples are from British English. As you might expect, the words the, be, and of are the three most frequent words in this corpus. Being highly frequent is not enough for a word to qualify as a key word, however. It may be that a word is used frequently, but only in a narrowly defined area, for example in newspapers or scientific text.
To be a key word in the Oxford 3,000, a word must be frequent across a range of text types. So the key words are frequent and used in a variety of contexts. Which brings us on to the next criterion, range. Some words are common in certain contexts, but not in other contexts. And words only appear in the Oxford 3,000 if they’re frequent in many contexts. That is, they have range. The third criterion is centrality. This ensures that relatively low-frequency words which are important (that is, they have basic meanings which cannot be communicated in other ways) are included. For example, Tuesday and Wednesday are not as frequent as Friday and Saturday.
But it would be ludicrous not to teach them when we’re teaching learners the days of the week. Other examples of words which are less frequent, but important to teach because of their centrality (or importance in everyday life), might include parts of the body, words used in travel, and words which we commonly find together. For example, it makes sense to teach pepper when you teach salt, even though pepper is probably less frequent than salt.
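The three criteria described in the transcript can be illustrated with a small Python sketch. This is not how the Oxford 3,000 was actually compiled. The toy corpus, the function names (`frequency_and_range`, `select_candidates`), the thresholds, and the whitespace tokenisation are all assumptions made for illustration. Frequency is counted as total occurrences, range as the number of text types a word appears in, and centrality is modelled simply as a hand-picked set of words that are always included.

```python
from collections import Counter


def frequency_and_range(corpus_by_type):
    """Return {word: (total_count, number_of_text_types_containing_word)}.

    corpus_by_type maps a text-type label (hypothetical, e.g. "news")
    to a string of text. Tokenisation is naive lowercase whitespace splitting.
    """
    totals = Counter()  # frequency: occurrences summed over all text types
    ranges = Counter()  # range: how many text types contain the word
    for text in corpus_by_type.values():
        counts = Counter(text.lower().split())
        totals.update(counts)
        ranges.update(counts.keys())  # each text type adds at most 1 per word
    return {word: (totals[word], ranges[word]) for word in totals}


def select_candidates(corpus_by_type, min_count, min_range, central_words=()):
    """Keep words that are frequent AND wide-ranging, plus 'central' words."""
    stats = frequency_and_range(corpus_by_type)
    selected = {
        word
        for word, (count, text_types) in stats.items()
        if count >= min_count and text_types >= min_range
    }
    # Centrality: add words judged important regardless of their counts.
    return selected | set(central_words)


# Invented toy corpus with three text types.
toy_corpus = {
    "news": "the minister will receive the report on tuesday",
    "conversation": "did you receive the shirt i sent",
    "science": "the enzyme will receive the substrate and the enzyme binds",
}

stats = frequency_and_range(toy_corpus)
selected = select_candidates(
    toy_corpus, min_count=2, min_range=2, central_words={"tuesday"}
)
print(sorted(selected))
```

In this toy data, "enzyme" occurs twice but only in the science text, so it fails the range test, while "tuesday" occurs once and is kept only because it is listed as central.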
The Oxford 3000
So we can see that it is almost impossible to give a definitive answer as to how many words there are in English, but how many of them do we actually use?
Research in Applied Linguistics can help course book writers refine their understanding of language and how it works. Studies have shown that the average native English speaker knows about 20,000 words, with university-educated people knowing around 40,000. In speech and everyday writing (emails, letters, notes, etc.), this goes down to about 5,000 very common words that are used repeatedly.
The Oxford English Dictionary contains 171,476 words in current use, whereas a vocabulary of just 3,000 words provides coverage of around 95% of common texts (Hu and Nation, 2000), enough for a learner to get by in English.
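To make the idea of coverage concrete, here is a minimal Python sketch. It assumes that coverage means the share of running words in a text that appear in a given word list, and it uses naive whitespace tokenisation. The function name `coverage`, the sample sentence, and the word list are invented for illustration and do not reproduce the method of the study cited above.

```python
def coverage(text, known_words):
    """Fraction of running words (tokens) in text found in known_words.

    Assumption: tokens are lowercase, whitespace-separated strings,
    with no handling of punctuation or word families.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Invented example: 4 of the 6 running words are in the list.
sample_text = "the cat sat on the mat"
word_list = {"the", "cat", "on"}
print(f"{coverage(sample_text, word_list):.0%}")
```

Note that repeated words count every time they occur, which is why a short list of very common words can cover a large share of a text.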
This animation gives an introduction to corpora, in particular the Oxford 3000. The concept of corpus is explained in more detail in week 5.
© University of Leicester