Humans and corpora
“Each day most Merriam-Webster editors devote an hour or two to reading a cross section of published material, including books, newspapers, magazines, and electronic publications … The editors scour the texts in search of new words, new usages of existing words, variant spellings, and inflected forms – in short, anything that might help in deciding if a word belongs in the dictionary, understanding what it means, and determining typical usage.”
“The OED requires several independent examples of the word being used, and also evidence that the word has been in use for a reasonable amount of time. The exact time-span and number of examples may vary: for instance, one word may be included on the evidence of only a few examples, spread out over a long period of time, while another may gather momentum very quickly, resulting in a wide range of evidence in a shorter space of time.”
Your task
What kind of information about new words do you think a corpus can provide, but a team of human readers cannot?
What kind of information about new words do you think a team of human readers can provide, but a corpus cannot?
Further reading
Select the following link for more information about the way Oxford Dictionaries are created.