• Lancaster University

Corpus Linguistics: Method, Analysis, Interpretation

Get a practical introduction to the methodology of corpus linguistics for researchers in the social sciences and humanities.

75,319 enrolled on this course

  • Duration: 8 weeks
  • Weekly study: 3 hours

Learn how to build and query corpora in this practical course

On this course, you’ll get a practical introduction to corpus linguistics, an extremely versatile methodology of language analysis using computers.

Over eight weeks, you’ll build the skills necessary to collect and analyse large digital collections of text (corpora).

You’ll be introduced to a number of topics demonstrating the use of corpora in areas as diverse as discourse analysis, sociolinguistics, and language learning and teaching.

Skip to 0 minutes and 8 seconds Hello and welcome to the Corpus MOOC,

Skip to 0 minutes and 11 seconds or as the full title says: ‘Corpus linguistics: method, analysis, interpretation’. In this course, you will get a practical introduction to the methodology of analysing large amounts of language data. You will learn how to collect, search and interpret this data with the help of specialized software that we have developed and that you can use for free. Corpus linguistics is an extremely versatile method with a whole range of applications in social science research and the digital humanities, as well as in practical areas such as marketing, journalism, language learning, language teaching and testing, textbook writing and so on. Behind me is the ESRC Centre for Corpus Approaches to Social Science at Lancaster University, or CASS, as it is known.

Skip to 1 minute and 2 seconds Here, some of the major corpus projects were conceived of, designed and indeed carried out. Think of corpora such as the British National Corpus 2014, 100 million words of current British English. All of these were compiled in the offices behind me. CASS specializes in building corpora, developing cutting-edge software and applying corpus linguistic techniques across the social sciences and the digital humanities. So if you have a question that can be answered by looking at language data, we can teach you how to look for the answer.

Skip to 1 minute and 44 seconds For example, if you are interested in how the verbs ‘to study’, ‘to research’ and ‘to investigate’ are used in current British English, or what their connections are and what they have in common, you can use #LancsBox to get the answer. This tool, developed at Lancaster University, allows you to analyse and visualise large amounts of language: millions and even billions of words. In this graph you can see the words typically associated with ‘study’, ‘research’ and ‘investigate’, and also the fact that there are a number of shared associations. One of them, ‘extensively’, typically occurs with all three verbs. So that was one example of using corpus data, and there are many, many more to come.

Skip to 2 minutes and 37 seconds If you join the MOOC, we are going to share with you our extensive experience of corpus research. We will take you on a journey from the basics of how to think about corpus data as a sample of language to the more complex questions of how to look for patterns in the data and how to analyse data statistically. We will provide numerous examples of applications, ranging from the analysis of discourse to sociolinguistics and applied linguistics. So let’s start this journey together!

What topics will you cover?

  • Introduction to corpus linguistics and basic techniques: concordancing
  • Further corpus techniques: collocation and keywords
  • Corpus-based discourse analysis
  • Building a corpus: tagging and processing data
  • Sociolinguistics: analysing BNC1994 and BNC2014
  • Textbook and dictionary construction
  • Language learning and corpus linguistics
  • Swearing extravaganza: looking at language and society
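
To give a flavour of the first technique in the list, concordancing, here is a minimal Python sketch of a keyword-in-context (KWIC) display. It is an illustration only and not how the course tools work internally: the function name `kwic`, the simple regular-expression tokeniser and the sample sentence are all assumptions made for this example.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node, right context) for each hit of `node`.

    Tokenisation is a deliberately crude assumption: lowercase
    alphabetic strings (plus apostrophes) count as words.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            # Up to `width` words on either side of the search term
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Invented sample text, used purely for illustration
sample = "We study language. They study large corpora to study patterns."

for left, node, right in kwic(sample, "study", width=2):
    # Right-align the left context so the search term lines up in a column
    print(f"{left:>20} [{node}] {right}")
```

Lining up the search term in a central column is what lets an analyst scan many hits at once and spot recurring patterns on either side of the word.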

When would you like to start?

  • Date to be announced

Learning on this course

On every step of the course you can meet other learners, share your ideas and join in with active discussions in the comments.

What will you achieve?

By the end of the course, you’ll be able to...

  • Interpret corpus data using techniques such as concordancing, collocation and keywords.
  • Describe the main methodological underpinnings behind corpus linguistics.
  • Apply corpus linguistic techniques to the analysis of different types of data.
  • Collect your own corpora.
  • Design research studies using corpus methods.
  • Explain corpus methods as well as a range of applications of this versatile methodology.
  • Perform corpus analysis using a range of corpus tools such as #LancsBox, CQPweb, USAS and BNClab.
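
As a rough illustration of the collocation technique named above, the following Python sketch finds words that co-occur with each of the three verbs mentioned in the course video (‘study’, ‘research’, ‘investigate’) and then intersects the results to find shared collocates. It is a simplified sketch under stated assumptions: it uses raw co-occurrence within a fixed window rather than a statistical association measure, the toy text is invented, and the function name `collocates` is hypothetical. It does not reproduce how #LancsBox or any other course tool computes collocations.

```python
import re
from collections import Counter

def collocates(tokens, node, span=2):
    """Count words occurring within `span` tokens of each hit of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            # Words to the left and right of the node, excluding the node itself
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Invented toy text, used purely for illustration
toy_text = (
    "researchers study dialects extensively . "
    "teams research dialects extensively . "
    "linguists investigate dialects extensively ."
)
tokens = re.findall(r"[a-z]+", toy_text.lower())

nodes = ["study", "research", "investigate"]
# Collocates shared by all three verbs: intersect the three sets of neighbours
shared = set.intersection(*(set(collocates(tokens, n, span=2)) for n in nodes))
print(sorted(shared))
```

On real corpora, raw counts like these are dominated by very frequent words, which is why corpus tools typically rank collocates with an association measure instead.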

Who is the course for?

This course is designed for anyone with an interest in the study of language.

Who will you learn with?

Professor in Corpus linguistics at Lancaster University, lead developer of #LancsBox.

Has been working for over 20 years to help pioneer new ways to use computers to analyse very large collections of language data.

Who developed the course?

Lancaster University

Lancaster University is a collegiate university, with a global reputation as a centre for research, scholarship and teaching with an emphasis on employability.

Learning on FutureLearn

Your learning, your rules

  • Courses are split into weeks, activities, and steps to help you keep track of your learning
  • Learn through a mix of bite-sized videos, long- and short-form articles, audio, and practical activities
  • Stay motivated by using the Progress page to keep track of your step completion and assessment scores

Join a global classroom

  • Experience the power of social learning, and get inspired by an international network of learners
  • Share ideas with your peers and course educators on every step of the course
  • Join the conversation by reading, @ing, liking, bookmarking, and replying to comments from others

Map your progress

  • As you work through the course, use notifications and the Progress page to guide your learning
  • Whenever you’re ready, mark each step as complete: you’re in control
  • Complete 90% of course steps and all of the assessments to earn your certificate

You can use the hashtag #corpusMOOC to talk about this course on social media.