Skip to 0 minutes and 9 seconds
So all these examples were used to demonstrate different aspects of the process of corpus building. We looked at design, development, and corpus annotation. We would now like to look at some of the general principles and draw some conclusions for building your own corpora. Here are five tips that you might like to follow if you are thinking about building your own corpus. First, start with corpus design. Think carefully about the type of language your corpus should represent. Should it be a general corpus, or should it be a specialised corpus? Second, keep notes. Throughout the process of corpus design and corpus development, keep notes about your decisions: what you included, what you excluded, and why.

Skip to 1 minute and 7 seconds
You might remember today, you might remember tomorrow, but in a week's time or in a year's time, these notes will be really important. Third, in terms of the practicalities of saving the data, it is advisable to save texts as separate files so that you can look at the distribution of linguistic features in different types of texts and in different components of the corpus. Fourth, always check the accuracy of the data, be it spoken data or written data. If it is spoken, you might like to re-listen to the recordings to make sure that the transcriptions are done accurately. If it is written data, you might like to look at the type of data.
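The tip about saving texts as separate files can be illustrated with a short Python sketch. It is not part of the lecture and is only one possible approach: it assumes the corpus is a folder of plain-text `.txt` files in UTF-8, and the function name `relative_frequency_per_file` and the very crude English-only tokeniser are hypothetical choices made for illustration.

```python
import re
from pathlib import Path


def relative_frequency_per_file(corpus_dir, word):
    """Return {file name: occurrences of `word` per 1,000 tokens}
    for every .txt file found directly inside `corpus_dir`."""
    target = word.lower()
    results = {}
    for path in sorted(Path(corpus_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8").lower()
        # Crude tokeniser (assumption): runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text)
        if not tokens:
            continue  # skip empty files rather than divide by zero
        hits = tokens.count(target)
        results[path.name] = 1000 * hits / len(tokens)
    return results
```

Because each text is its own file, the result shows how the feature is distributed across texts instead of giving a single total for the whole corpus.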

Skip to 1 minute and 52 seconds
If you've downloaded the data from the internet, for instance, you might like to make sure that you don't include any of the HTML code or the boilerplate. And finally, select a suitable tool that will be useful for the analysis of the corpus. In the practical sessions, we will be looking at #LancsBox and how to use #LancsBox to build and analyse your own corpus. So these were the tips for building corpora. Thank you very much for listening to this lecture.
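The tip about web data, removing HTML code and boilerplate before a page goes into the corpus, can be sketched with Python's standard `html.parser` module. This is an illustration under stated assumptions, not the method used in the course: the set `SKIP_TAGS` is a guess at which elements usually hold boilerplate, and the names `TextExtractor` and `html_to_text` are hypothetical. Real pages often need manual checking afterwards.

```python
from html.parser import HTMLParser

# Assumption: text inside these elements is treated as code or boilerplate.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags and the contents of SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Return the text of an HTML page with whitespace normalised."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join chunks with spaces, then collapse all runs of whitespace.
    return " ".join(" ".join(parser.chunks).split())
```

The output is a single line of text per page, so paragraph breaks are lost; if they matter for the analysis, the extractor would need to be extended.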

Part 5: Building your own corpus - tips

Finally, Vaclav Brezina provides five tips for corpus building.

Update your journal

After the video, don’t forget to update your journal! Keep a record of what you are learning. You will find it really helps as the course proceeds if you keep clear, structured notes of what you have learnt.

This video is from the free online course:

Corpus Linguistics: Method, Analysis, Interpretation

Lancaster University