
Things to consider when collecting sociolinguistic data

In this article, Dr Sarah Kelly explains some of the key factors sociolinguists think about when collecting speech data.

Let’s begin this activity by looking at some of the main considerations sociolinguists must face when they want to conduct their own research.


Ethics

No matter who is being studied, or what data is being collected, it is crucial to consider how our research might affect everyone involved. When conducting interviews (or using other sociolinguistic data collection methods), there is always the chance that people might give you personal information. Being involved in research might also draw out negative emotions or memories. This is a particular risk when we’re asking people to talk freely about their own identity or experiences. There may also be risks to researchers during data collection. For example, they might travel to dangerous places or interact with vulnerable people.

For these reasons, in sociolinguistic research, you must get permission to do the research from your university or institution. Also, it is good practice to allow participants to learn about your study before they consent to taking part. Usually, consent is gained by providing participants with an information sheet along with a consent form to sign if they feel able to take part in the research.   

One potential risk of giving participants lots of information about your research is that they will then change their speech from what is ‘normal’ for them. You can lessen this effect by giving participants less specific detail before the data collection begins. For example, you might want to collect data on how people pronounce the ‘a’ vowel in words like ‘grass’ or ‘giraffe’. You might tell your participants that you want to analyse their speech, but withhold the fact that you’re only really interested in how they produce ‘a’ vowels. Then, at the end of the data collection, you can disclose more information about the exact purpose of the research.

Sample size

If sociolinguists want to study language variation, how many people do they need to study to be able to make robust claims? There is really no right answer to this one. How many people are involved in your research will depend on many factors. For instance, maybe there aren’t many people available to you who belong to the group you’re interested in. There’s also the task of analysing all of the data that you’re collecting. If a linguist is working by themselves on a project, it makes sense that more data would take longer (or be more difficult) to analyse.   

Ideally, though, researchers should aim to collect roughly equivalent quality and quantity of data for each social group being studied. If you wanted to compare male and female speakers of Middlesbrough English, for example, it would be ideal to have even numbers of male and female speakers. It would also be useful if these male and female participants produced roughly the same amount of speech data.
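As a rough illustration of this kind of balance check, the sketch below counts the speakers and the total amount of transcribed speech in each group. The participant records, the word counts and the function name `summarise_balance` are all hypothetical, invented for this example; they are not data from a real study.

```python
from collections import defaultdict

# Hypothetical participant records:
# (speaker id, social group, number of transcribed words).
participants = [
    ("S01", "female", 1200),
    ("S02", "female", 950),
    ("S03", "male", 1100),
    ("S04", "male", 1000),
]


def summarise_balance(records):
    """Return {group: (number of speakers, total words)} for the records."""
    speakers = defaultdict(int)
    words = defaultdict(int)
    for _speaker_id, group, word_count in records:
        speakers[group] += 1
        words[group] += word_count
    return {group: (speakers[group], words[group]) for group in speakers}


summary = summarise_balance(participants)
for group, (n_speakers, n_words) in summary.items():
    print(f"{group}: {n_speakers} speakers, {n_words} words")
```

If one group turns out to have far fewer speakers or far less speech than the other, that is a sign that more data may be needed before comparing the groups.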


Sampling

‘Sampling’ refers to how researchers go about selecting their participants. One method is to choose potential participants at random. For example, you might select people from a telephone directory. Another, more commonly used approach to sampling is to select participants that you think would be most suitable for your research. Let’s say, for instance, that you’re interested in researching the speech of your own Punjabi-speaking community. You could then make use of your own social network to find people who would be suitable candidates. This approach would also likely help participants to feel that they could trust you with their data.
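The random approach can be sketched in a few lines of Python. This is only an illustration: the `directory` list stands in for a real telephone directory, and the function name `draw_random_sample` and the sample size of 10 are assumptions made for the example.

```python
import random

# Hypothetical stand-in for a telephone directory of 100 entries.
directory = [f"Person {i}" for i in range(1, 101)]


def draw_random_sample(population, k, seed=None):
    """Pick k different entries at random; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(population, k)


sample = draw_random_sample(directory, 10, seed=42)
print(sample)
```

Fixing the seed means the same list of people is drawn each time the code runs, which makes it easier to document how participants were chosen. The people selected would still need to be contacted and to give their consent before taking part.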

Identifying social factors

One of the key things that sociolinguists look for in their research is evidence of language variation. This variation is usually tied to some social factor relevant to the people being studied, such as their national identity or how they define their gender. These social factors might be identified externally: this is where the researcher decides who to study based on their own beliefs about the community. For example, a researcher who wanted to explore how speech is affected by social class might look for participants in areas they think are more or less economically well-off. Other sociolinguists instead allow their participants to identify their own social factors. This approach is particularly beneficial when studying a community that the researcher is less familiar with. The idea here is that members of a group or community are able to make finer distinctions between the members of their own community than an outsider could.

© University of York
This article is from the free online course An Introduction to Sociolinguistics: Accents, Attitudes and Identity.

Created by
FutureLearn - Learning For Life
