PART 1: Corpus linguistics and sociolinguistics: Introduction

This video presents the key terminology for sociolinguistic exploration of corpora.
Hello, my name is Vaclav Brezina. In this video I will be talking about sociolinguistics and corpus linguistics and how these two disciplines can work together. So in this lecture, we are going to move away from the pure description of language as seen in dictionaries and in grammar books. These are wonderful descriptions, but what we are going to do is look at the realm of language variation: go beyond the rules and look at how these rules change and vary across different contexts and across different speakers. And that’s actually what variationist sociolinguistics is interested in.
So when we look at a map of a country, such as the map of the UK here, and we think about how language is used across the country by different speakers, different genders, different age groups, people in different regions, people from different socioeconomic groups, and so on, we can see that there’s a lot of variation. And this variation is not free variation; it has constraints and conditions that we can investigate as part of our sociolinguistic inquiries.
So what I would like to do now is to invite you on a journey across the UK, using corpus evidence to see how language is used out there in the wild, if you like: how different people and different social groups use language. And for this trip, you don’t need any train tickets or bus tickets. All you need is a computer and access to the corpora I’ll be talking about. So it is a very cheap journey, and even if you are short of travel money you can still make it very efficiently.
So with the existence of corpora such as the British National Corpus, which was originally compiled in 1994, and a brand new corpus, the British National Corpus 2014, which we are currently developing at Lancaster University and the spoken part of which is already available, we can actually investigate sociolinguistic variation on a large scale. So we have these two beautiful corpora, the British National Corpus 1994 and the British National Corpus 2014, which are 20 years apart. And what we can investigate is both how language has developed over the 20 years between these two corpora and what the current use of language is in the UK, based on the British National Corpus 2014.
So before we look at some examples of how these corpora can be used, I would like to talk about some basic linguistic notions and some theoretical background for this type of investigation. First of all, there are two main approaches to sociolinguistic variation. The first is the so-called Labovian, traditional variationist approach, which tries to define something called sociolinguistic variables. Sociolinguistic variables, as Labov says, are different ways of saying the same thing. What we are, in effect, trying to do is look at the competition between two different forms that can occur in similar contexts. And we look at how these are conditioned by various contexts: social contexts and also linguistic (internal) contexts.
So to give you an example, here’s a typical Labovian variable: an utterance that says, “I don’t think we should go out any more.” Well, I have a choice here, a linguistic choice. Without changing the meaning of this utterance, I can also say, “I don’t think we should go out no more.” And it would be very interesting to see what conditions the choice between these two options, or linguistic variants, if you like. So in the traditional variationist context, we would investigate the competition between the “no” variant and the “any” variant.
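The competition between the “any” and the “no” variant can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the course: the utterances, the group labels `group_a` and `group_b`, and the function names are all invented for this example, and no real corpus data is involved.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy data: (speaker_group, utterance) pairs.
# Invented for illustration only; these are not taken from any corpus.
UTTERANCES = [
    ("group_a", "I don't think we should go out any more"),
    ("group_a", "we don't go there any more"),
    ("group_a", "she doesn't call no more"),
    ("group_b", "I don't want it no more"),
    ("group_b", "they don't live here no more"),
    ("group_b", "he doesn't work there any more"),
]

# One pattern per variant of the variable ("any more" vs "no more").
VARIANTS = {
    "any": re.compile(r"\bany\s?more\b", re.IGNORECASE),
    "no": re.compile(r"\bno\s?more\b", re.IGNORECASE),
}


def variant_counts(utterances):
    """Count how often each variant occurs, separately for each speaker group."""
    counts = defaultdict(Counter)
    for group, text in utterances:
        for name, pattern in VARIANTS.items():
            counts[group][name] += len(pattern.findall(text))
    return counts


def variant_share(counts, variant="no"):
    """Proportion of one variant out of all occurrences of the variable, per group."""
    shares = {}
    for group, c in counts.items():
        total = sum(c.values())
        shares[group] = c[variant] / total if total else 0.0
    return shares


if __name__ == "__main__":
    counts = variant_counts(UTTERANCES)
    for group, share in variant_share(counts).items():
        print(group, dict(counts[group]), f"share of 'no' variant: {share:.2f}")
```

A pattern this simple would also match “no more” where the two forms are not interchangeable (for example “no more cake”), so a real study would first restrict the counts to contexts where both variants are possible.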
On the other hand, a broader approach, which I call Biber’s approach to language variation, focuses on speakers’ and writers’ choices from a large inventory of lexico-grammatical features and looks at the functional variation there. So there’s no clear meaning-preserving competition between linguistic variants that could be easily analysed in terms of Labovian variables. Instead, we have something that I call ambient variables: variables that can appear anywhere and are not in direct competition. Again, to give you an example, here you have an utterance taken from the British National Corpus, and you can see that in this very short utterance there are many expressions such as “well”, “you know”, “you see”, “I don’t know”, “I suppose”.
These are what we call hedges, discourse features that mitigate the strength of the utterance. These hedges are very important. They signal informal, spoken, dialogic language and are much more frequent in these types of registers. However, they are not in direct competition with any other variables, so they can appear anywhere or nowhere at all. So now we’ve looked at some basic principles and basic approaches to sociolinguistic variation. In the next segment, we will look at examples of some basic investigations of sociolinguistic variation.
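Because hedges are not in competition with another form, they are usually reported as a frequency relative to the amount of text rather than as a share of variants. The sketch below shows one way this could be done, under stated assumptions: the hedge list is limited to the five expressions named in the lecture, the sample sentence is invented, and the function names are hypothetical. It is not the procedure used for the BNC.

```python
import re

# The five expressions named in the lecture; a real study would use a longer list.
HEDGES = ["you know", "you see", "i don't know", "i suppose", "well"]


def tokenise(text):
    """Lower-case the text, straighten curly apostrophes, and split it into word tokens."""
    text = text.lower().replace("’", "'")
    return re.findall(r"[a-z']+", text)


def count_hedges(tokens, hedges=HEDGES):
    """Count hedge expressions, trying longer phrases first so nothing is counted twice."""
    phrases = sorted((h.split() for h in hedges), key=len, reverse=True)
    total = 0
    i = 0
    while i < len(tokens):
        for phrase in phrases:
            if tokens[i:i + len(phrase)] == phrase:
                total += 1
                i += len(phrase)  # skip past the matched phrase
                break
        else:
            i += 1  # no hedge starts at this token
    return total


def per_million(count, n_tokens):
    """Normalised frequency: occurrences per million tokens."""
    return count / n_tokens * 1_000_000 if n_tokens else 0.0


if __name__ == "__main__":
    # Invented sample sentence, for illustration only.
    sample = "Well, you know, I suppose it is fine, you see, I don't know."
    tokens = tokenise(sample)
    hits = count_hedges(tokens)
    print(hits, "hedges in", len(tokens), "tokens")
    print(f"{per_million(hits, len(tokens)):.0f} per million tokens")
```

Simple string matching cannot tell the discourse marker “well” from the adverb in “she sings well”, so counts like these would need manual checking or part-of-speech information before they are interpreted.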

Vaclav Brezina introduces sociolinguistics and different approaches to language variation.

In the lecture, he makes reference to the BNC2014, a new corpus which is being developed at Lancaster University.

You may want to open the webpage in another browser window.

This article is from the free online course Corpus Linguistics: Method, Analysis, Interpretation.

Created by
FutureLearn - Learning For Life
