The added value

Find out about the role of quantitative information in our understanding of language.

Examples taken from corpora, unlike artificially constructed examples, show evidence for how language is actually used by people.

This is one of the main strengths of corpus-based and corpus-driven lexicography. But corpora add another very important dimension to dictionary making: quantitative information.

Imagine you are drafting the dictionary entry for ‘chairperson’ and you are unsure whether the spelling ‘chair-person’ is more common than ‘chairperson’. How would you know? A corpus will tell you: if you choose an up-to-date general language corpus that is large enough, you can rely on its frequency counts to make a decision about your entry. Of course, corpora may also contain mistakes or typos, and it is important to keep this in mind when working with them.
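As an illustration of the kind of check described above, here is a minimal Python sketch that counts two spelling variants in a piece of text. The function name `count_variants` and the short `sample` text are invented for this example; a real decision would rest on a large, up-to-date corpus queried with proper corpus tools, not on a few sentences.

```python
import re
from collections import Counter


def count_variants(text, variants):
    """Count case-insensitive occurrences of each spelling variant.

    A match must not be preceded or followed by a letter, digit,
    underscore or hyphen, so 'chairperson' is not counted inside
    a longer hyphenated or compound form.
    """
    counts = Counter()
    for variant in variants:
        pattern = r"(?<![\w-])" + re.escape(variant) + r"(?![\w-])"
        counts[variant] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Invented sample text standing in for a corpus (assumption, not real data).
sample = (
    "The chairperson opened the meeting. A new chairperson was elected. "
    "The chair-person thanked the committee, and the chairperson's report followed."
)

counts = count_variants(sample, ["chairperson", "chair-person"])
for variant, n in counts.most_common():
    print(variant, n)
```

The variant with the higher count is the candidate for the headword spelling, but raw counts from a small or unrepresentative sample can easily mislead, which is why corpus size and quality matter.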

Frequency information offers lexicographers empirical evidence against which they can compare their intuitions, and helps them make decisions about which words or senses to include, and how to present them to the users.

This content is taken from the Coventry University online course Understanding English Dictionaries.

Linguists have studied frequency effects in language for a long time. One well-known fact about language is that word frequencies have a skewed distribution. You probably know already that some words like ‘take’ or ‘and’ are more frequent than others like ‘diagonalise’ or ‘amoxicillin’. What is more, a relatively small number of words (the most frequent ones) tend to cover a very large proportion of the words found in a text. This pattern is usually known as Zipf’s Law, after the linguist George Kingsley Zipf, who popularised it. In simple terms, Zipf’s Law leads us to expect that the most frequent word in a text occurs about twice as often as the second most frequent word, three times as often as the third most frequent word, and so on. As a rough guide, this means that the 15 most frequent words will account for about 25% of the text, the first 100 for 60%, the first 1,000 for 85%, and the first 4,000 for 97.5%.
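The relationship described above can be sketched in a few lines of Python. This is an idealised model in which the frequency of a word is exactly proportional to 1 divided by its rank. The function names and the vocabulary size of 100,000 are assumptions made for illustration, and the coverage values the model produces depend on that vocabulary size, so they will only roughly resemble percentages observed in real texts.

```python
def zipf_expected_count(top_count, rank):
    """Expected count of the word at a given rank, if counts fall off as 1/rank."""
    return top_count / rank


def harmonic(n):
    """Sum of 1/r for r = 1..n (the total 'weight' of the top n ranks)."""
    return sum(1.0 / r for r in range(1, n + 1))


def zipf_coverage(top_k, vocab_size):
    """Share of all word tokens covered by the top_k ranks in the idealised model."""
    return harmonic(top_k) / harmonic(vocab_size)


# If the most frequent word occurred 6,000 times, the idealised model
# predicts half as many for rank 2 and a third as many for rank 3.
print(zipf_expected_count(6000, 2))
print(zipf_expected_count(6000, 3))

# Coverage of the top ranks for a hypothetical vocabulary of 100,000 word types.
for k in (15, 100, 1000, 4000):
    print(k, round(zipf_coverage(k, 100_000), 3))
```

Coverage grows quickly for the first few ranks and then flattens out, which is the skew the paragraph describes: a handful of very frequent words do most of the work, and a long tail of rare words each contributes very little.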

One consequence of Zipf’s Law is that lexicographers working on rare words need very large corpora, typically in the range of several hundred million or a few billion words. This is because in a medium-sized corpus rare words may simply never be seen. Certain dictionaries have used corpus frequency information to highlight the most common words in a language. For example, as we saw in Week 2, the Macmillan Dictionary presents a core vocabulary of the 7,500 most frequent words in the English language as ‘red words’, thus signalling that these are the most important words to learn for people whose first language isn’t English.
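To see why corpus size matters for rare words, the following sketch estimates how often a word would be expected to appear in corpora of different sizes. The rate of 0.1 occurrences per million words is a made-up figure for illustration, and the chance of never seeing the word is approximated with a simple Poisson model, which assumes occurrences are independent. Real words tend to cluster in particular texts, so treat the numbers as indicative only.

```python
import math


def expected_hits(rate_per_million, corpus_size):
    """Expected number of occurrences of a word in a corpus of corpus_size words."""
    return rate_per_million * corpus_size / 1_000_000


def prob_unseen(rate_per_million, corpus_size):
    """Poisson approximation of the chance that the word never occurs at all."""
    return math.exp(-expected_hits(rate_per_million, corpus_size))


# Hypothetical rare word: 0.1 occurrences per million words (assumption).
rate = 0.1
for size in (1_000_000, 100_000_000, 1_000_000_000):
    hits = expected_hits(rate, size)
    unseen = prob_unseen(rate, size)
    print(f"{size:>13,} words: expected {hits:g} hits, P(unseen) = {unseen:.3g}")
```

Under these assumptions the word would most likely be missing from a one-million-word corpus, while a corpus of a billion words would be expected to yield around a hundred examples, enough for a lexicographer to study its senses and typical contexts.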
