How to find new words automatically?

Barbara interviews Emmanuel Cartier about his project Neoveille on neologism tracking.
Hello! Today I’m with Emmanuel Cartier. Hi, Emmanuel, can you say a few words about yourself?
Yeah, thank you. So I am Emmanuel Cartier, an Assistant Professor at the University of Paris 13 and a researcher at the National Centre for Scientific Research in France, and I am a linguist and a computational linguist.
Can you tell us a little about your research project and its challenges?
Since 2015 we have been working on a research project called Neoveille, which is funded by the ANR, the French research agency. The aim of this research is to automatically detect new words and meanings in seven languages: French, Polish, Czech, Greek, Brazilian Portuguese, Russian and Mandarin Chinese.
So, we work with research partners in eight different countries from France to China. This project has many challenges. One challenge is how to automatically find neologisms. There are three main ways in which neologisms are introduced into a language. One way is the creation of a new word, for example with derivation (like ‘Twittery’) or composition (‘tweet clash’ or ‘fact tweet’, for example). To detect this type of neologism, the main idea is to check whether a word appears in a reference dictionary. But electronic dictionaries are not available for every language, and they do not cover the whole vocabulary of the language. Texts also contain spelling mistakes. So, we combine several techniques, such as machine learning.
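
The dictionary check described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the Neoveille implementation: the tiny `REFERENCE_LEXICON`, the `tokenize` helper and the sample sentence are all hypothetical. As the interview notes, words missing from the dictionary may also be spelling mistakes or foreign words, so the output is a list of candidates that still needs filtering.

```python
import re
from collections import Counter

# Hypothetical reference lexicon; a real system would load a full
# electronic dictionary for the language being monitored.
REFERENCE_LEXICON = {
    "the", "a", "new", "tweet", "clash", "was", "very", "it",
}

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())

def candidate_neologisms(text, lexicon):
    """Count the tokens that are absent from the reference lexicon."""
    return Counter(tok for tok in tokenize(text) if tok not in lexicon)

sample = "The new tweet was very twittery. It was a twittery tweet clash."
candidates = candidate_neologisms(sample, REFERENCE_LEXICON)
print(candidates.most_common())
```

Counting the candidates, rather than only listing them, keeps the frequency information that later helps to separate one-off forms from words that are spreading.
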
Another way of adding new words to a language is through borrowings from other languages. To find new borrowings we can also use dictionaries, but in the multilingual and connected era we live in, a lot of texts contain foreign words that are not borrowings, so we need to filter those out. A further way to add innovation to a language is through new usages of existing words, which are also called semantic neologisms. For example, the meaning of ‘mouse’ as a computer pointing device was created from the initial meaning referring to the animal and then translated into other languages. How can we find these words automatically?
We have to rely on more sophisticated techniques, like statistics and recent advances in semantic analysis, especially distributional semantics. These methods track how a word is used and detect changes in its profile over time. And finally, another challenge in this project is how to follow and track the emergence and spread of neologisms. We can see that frequency is a good first indication, but most neologisms just occur once and then disappear.
That’s really interesting. And can you tell us about the findings and the results of your project so far?
Yeah. So thanks to this project we can answer many interesting questions like: What is the importance of neologisms in the history of languages? Are there specific communities from which neologisms mainly arise? And can we explain why many neologisms only appear once and a few are adopted by everyone? So, during the project we have achieved several results. One main achievement is the web platform. The general public can consult the results on the website. A linguist can use this platform to add new web sources of information. They can also approve the neologisms that are automatically detected and describe them. Another achievement is an exhaustive analysis of neologisms in French: we have described more than 20,000 neologisms, drawn from 250 web sources, which have appeared in the past three years.
All of those results, together with publications, are available on the website.
That’s really exciting. And what are your future plans for the project?
Yeah, in the future we plan to do an exhaustive study of languages other than French. So far we have researched Italian, Greek, Portuguese, Czech and Polish. We have collected more than 10,000 neologisms for each language. We are describing and analysing them, and we intend to publish the results next year. Semantic neologism detection is still at an early stage, and the time span of our corpora is not yet sufficient to draw any conclusions on models of spread.
So, we are also in contact with dictionary editors to track new words and new usage to update existing dictionaries, and we are setting up a research network on lexical innovation at the European level. Thank you.
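
The distributional idea mentioned in the interview, tracking how a word is used and detecting changes in its profile over time, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions rather than the project's method: the two mini-corpora, the `STOPWORDS` set, the window size and the function names are hypothetical, and simple co-occurrence counts stand in for the more advanced models a real system would use.

```python
import math
from collections import Counter

# Hypothetical stopword list so that function words do not dominate the profile.
STOPWORDS = {"the", "a", "into"}

def context_profile(sentences, target, window=2):
    """Count content words occurring within `window` tokens of `target`."""
    profile = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), i + window + 1
            for j in range(lo, min(hi, len(tokens))):
                if j != i and tokens[j] not in STOPWORDS:
                    profile[tokens[j]] += 1
    return profile

def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical mini-corpora for an earlier and a later period.
earlier = ["the mouse ate the cheese", "a cat chased the mouse"]
later = ["click the mouse button twice", "plug the mouse into the computer"]

old_profile = context_profile(earlier, "mouse")
new_profile = context_profile(later, "mouse")
similarity = cosine(old_profile, new_profile)
print(similarity)
```

On this toy data the two profiles share no context words, so the similarity is 0, which is the kind of signal that would flag ‘mouse’ for a linguist to review. With real corpora the similarity would rarely be this clear-cut, and a threshold would have to be chosen empirically.
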

Barbara McGillivray interviews computational linguist Emmanuel Cartier about his project Neoveille on neologism tracking.

Emmanuel Cartier is an Assistant Professor at the University of Paris 13.

The video is primarily about the Neoveille project, which explores how new words and meanings are detected from a range of global languages.

Further reading

Cartier, E. (2017) ‘Neoveille, A Web Platform for Neologism Tracking’. in: Proceedings of the EACL 2017 Software Demonstrations. Valencia, Spain. 3-7 April 2017, 95-98. available from

Cartier E., Sablayrolles J.-F., Boutmgharine N., Humbley J., Bertocci M., Jacquet-Pfau C., Kübler N. et Tallarico G. (2018) ‘Détection Automatique, Description Linguistique et Suivi des Néologismes en Corpus: Point d’étape sur les Tendances du Français Contemporain’. Actes du Congrès Mondial de Linguistique Française, Mons (Belgique). held 9-13 Juillet 2018. available from

Cartier, E. (2019) ‘Neoveille, Plateforme de Détection, de Repérage et de Suivi des Néologismes en Onze Langues’. Neologica. available from

This article is from the free online course Understanding English Dictionaries.
