
When and why to obtain sequencing data

Article discussing why to obtain sequencing data
© COG-Train

As humans, we have an innate ability, and even a NEED, to spot patterns. Psychologists have found that we use patterns to build rules that guide us to make faster and more accurate decisions.

So, what does this have to do with sequencing data? EVERYTHING!

Let’s take a step back. What do we know about SARS-CoV-2? We know that it accumulates ~33 mutations per year per genome. These mutations are important because we use them to track the spread of the virus and to evaluate the effectiveness of our interventions. They also enable us to classify SARS-CoV-2 into lineages, clades and variants, and to examine how those variants relate to specific symptoms and the severity of the disease.
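To put the ~33 mutations per year into perspective, here is a small Python sketch of the arithmetic. It assumes a genome length of about 29,903 nucleotides (the length of the commonly used Wuhan-Hu-1 reference sequence, which is not stated in this article) and a constant rate over time, which is a simplification. The function names are made up for illustration.

```python
# Assumption: genome length of the Wuhan-Hu-1 reference (about 29,903 nt).
GENOME_LENGTH = 29_903
# Figure quoted in the article.
MUTATIONS_PER_YEAR = 33


def per_site_rate(mutations_per_year, genome_length):
    """Average substitutions per site per year."""
    return mutations_per_year / genome_length


def expected_mutations(mutations_per_year, days):
    """Expected mutations after a number of days, assuming a constant rate."""
    return mutations_per_year * days / 365


rate = per_site_rate(MUTATIONS_PER_YEAR, GENOME_LENGTH)
per_month = expected_mutations(MUTATIONS_PER_YEAR, 30)

print(f"{rate:.2e} substitutions per site per year")
print(f"{per_month:.1f} expected mutations per genome in 30 days")
```

Under these assumptions, two genomes sampled a month apart would be expected to differ by only a handful of mutations, which is why whole genomes are compared rather than short fragments.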

Sequencing data not only gives us insights into the size and growth rate of an epidemic but also lets us monitor the evolution and spread of variants. In the long term, rapid and large-scale sequencing allows us to track new variants to aid in vaccine development.

So, we compare collections of sequenced genomes to spot trends and patterns, which lets us classify data and respond to a crisis faster and more accurately than we could without seeing those patterns. It also allows us to make some predictions, such as, “we are due for another viral outbreak.”
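At its simplest, genomic comparison means lining up a sample against a reference and listing where they differ. The Python sketch below shows the idea for two sequences that are already aligned and of equal length. The sequences are made-up toy strings, not real SARS-CoV-2 data, and the function name is hypothetical. Real analyses use dedicated alignment and variant-calling tools.

```python
def list_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and of
    equal length. Positions where either sequence has an ambiguous base
    ('N') are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if "N" in (ref_base, alt_base):
            continue  # ambiguous call, cannot be compared
        if ref_base != alt_base:
            differences.append((position, ref_base, alt_base))
    return differences


# Toy example with invented sequences.
reference = "ACGTACGT"
sample = "ACGAACNC"
for position, ref_base, alt_base in list_differences(reference, sample):
    print(f"{ref_base}{position}{alt_base}")
```

Collecting such difference lists across many samples is what makes shared mutations, and therefore lineages, visible as patterns.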

When and why would a laboratory or an individual researcher want to obtain sequencing data?

One case is when you have a biological sample but no prior knowledge of where the significantly expressed regions are, so you cannot design primers to amplify them. If you still want to know which coding genes are present, which non-coding regions are expressed, and where the mutations are across the genome, you can apply whole-genome sequencing.

If you want to know where all the mutations in the protein-coding regions are, you can use whole-exome sequencing. If you are interested in which genes are differentially expressed in one sample versus another, you might use RNA sequencing. There are many sequencing approaches and technologies, and the right one depends on your research question.

Another interesting justification for sequencing is when you have a very small amount of sample, a severely degraded sample, or a mixed sample. This is typically the case in forensics. For these types of analyses, the conventional PCR approach is not suitable, but sequencing can provide valuable insights.

You might also want sequencing data when you have a hypothesis you wish to test but no biological samples of your own, or when you have samples but not enough to work with. To detect significant effects, you need enough data points to back them up and to actually SEE THE PATTERNS. You need to consider statistical power and sample size.
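As a rough illustration of how statistical power and sample size relate, the Python sketch below uses the standard normal-approximation formula for comparing the means of two equally sized groups: n per group = 2 × ((z for 1 − alpha/2 + z for power) / d)², where d is the standardised effect size. The defaults of alpha = 0.05 and power = 0.8 are common conventions, not values from this article, and the function name is made up. Treat the result as a first estimate, not a substitute for a proper power analysis.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate samples needed per group for a two-sided, two-group comparison.

    effect_size is the standardised difference between group means
    (difference divided by the standard deviation).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# Smaller effects need many more samples to be detected.
for d in (0.2, 0.5, 1.0):
    print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

The key point is the inverse-square relationship: halving the effect size you want to detect roughly quadruples the number of samples you need.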

You can still do research without generating your own data. Data covering many different sample types and attributes, such as virus or (micro)organism, geographical location, race, sex, age group and disease, can be found in FREE data repositories. This data repository guidance from Scientific Data will guide you to an array of public data repositories that you might find useful.

This article is from the free online course Making sense of genomic data: COVID-19 web-based bioinformatics

Created by
FutureLearn - Learning For Life