Skip to 0 minutes and 8 seconds Hello and welcome to the Corpus MOOC,
Skip to 0 minutes and 11 seconds or as the full title says: ‘Corpus linguistics: method, analysis, interpretation’. In this course, you will get a practical introduction to the methodology of analysing large amounts of language data. You will learn how to collect, search and interpret this data with the help of specialized software that we have developed and that you can use for free. Corpus linguistics is an extremely versatile method with a whole range of applications in social science research and the digital humanities, as well as in practical areas such as marketing, journalism, language learning, language teaching and testing, textbook writing and so on. Behind me is the ESRC Centre for Corpus Approaches to Social Science at Lancaster University, or CASS, as it is known.
Skip to 1 minute and 2 seconds Here, some of the major corpus projects were conceived of, designed and indeed carried out. Think of corpora such as the British National Corpus 2014, 100 million words of current British English. All of these were compiled in the offices behind me. CASS specializes in building corpora, developing cutting-edge software and applying corpus linguistic techniques across the social sciences and the digital humanities. So if you have a question that can be answered by looking at language data, we can teach you how to look for the answer.
Skip to 1 minute and 44 seconds For example, if you are interested in how the verbs ‘to study’, ‘to research’ and ‘to investigate’ are used in current British English, what their connections are and what they have in common, you can use #LancsBox to get the answer. This tool, developed at Lancaster University, allows you to analyse and visualise large amounts of language: millions and even billions of words. In this graph you can see the words typically associated with ‘study’, ‘research’ and ‘investigate’, and also the fact that there are a number of shared associations. One of them, ‘extensively’, typically occurs with all three verbs. So that was one example of using corpus data, and there are many, many more to come.
Skip to 2 minutes and 37 seconds If you join the MOOC, we are going to share with you our extensive experience of corpus research. We will take you on a journey from the basics of how to think about corpus data as a sample of language to the more complex questions of how to look for patterns in the data and how to analyse data statistically. We will provide numerous examples of applications, ranging from the analysis of discourse to sociolinguistics and applied linguistics. So let’s start this journey together!