DNA Sequencing
Sequencing chromatogram

Looking at DNA: Sequencing technologies

DNA sequencing is the process of determining the order of the bases – adenine, guanine, cytosine and thymine – in a molecule of DNA.

In the mid-1970s, a scientist called Fred Sanger developed a DNA sequencing method, eponymously known as Sanger sequencing, which revolutionised molecular biology. Unravelling the genetic code enabled a vast breadth of scientific applications, from basic science through to translational applications such as diagnostic testing and targeted drug therapy.

Improvements over the years to Sanger’s original method allowed scientists to sequence sections of DNA up to around 600 bases in length. Because scientists could only sequence one small section of DNA at a time, the time and cost required to sequence whole genomes remained prohibitive. Next generation sequencing (NGS) methods solved this problem by allowing hundreds of thousands of fragments of DNA to be sequenced at the same time, an approach known as massively parallel sequencing.

The term “next generation sequencing” (NGS) refers to many different methods used to sequence DNA. However, all methods follow the same basic principles:

  1. Sample DNA is processed into smaller fragments for sequencing.
  2. The sequence of bases in many different fragments of DNA is read at the same time using NGS technology. The number of fragments sequenced at the same time ranges from hundreds to millions, depending on the type of sequencing being undertaken.
  3. A computer file is generated containing the base sequences derived from the DNA fragments. Each individual length of sequence, which arose from an original DNA fragment, is known as a “read”. Reads are usually between 50 and 300 bases long, but can be longer depending on the NGS method used.
  4. Specialised software analyses the reads and matches them back to the specific place in the genome they arose from, using a reference genome sequence as a template. This is known as “alignment” or “mapping”.
  5. Differences between the sample DNA and the reference DNA are identified. This is known as “variant calling”.
  6. The likely effect that a genetic variant will have on a protein is identified. This is known as “variant annotation”.
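
The first five steps above can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions: the sequences are invented, "sequencing" is simulated by cutting the sample into overlapping, error-free reads, alignment is a brute-force search for the offset with the fewest mismatches, and a variant is called wherever the majority base in the reads differs from the reference. The function names (`fragment`, `align`, `call_variants`) are hypothetical and do not correspond to any real NGS software; variant annotation (step 6) is not shown.

```python
from collections import Counter, defaultdict


def fragment(sample, read_len, step):
    """Steps 1-3 (toy): cut the sample into overlapping reads."""
    return [sample[i:i + read_len]
            for i in range(0, len(sample) - read_len + 1, step)]


def align(read, reference):
    """Step 4 (toy): return the reference offset with the fewest mismatches."""
    best_pos, best_mismatches = 0, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos


def call_variants(reads, reference):
    """Step 5 (toy): pile up aligned bases and report positions where
    the most common base in the reads differs from the reference."""
    pileup = defaultdict(Counter)
    for read in reads:
        start = align(read, reference)
        for offset, base in enumerate(read):
            pileup[start + offset][base] += 1
    variants = []
    for pos in sorted(pileup):
        base, _count = pileup[pos].most_common(1)[0]
        if base != reference[pos]:
            variants.append((pos, reference[pos], base))
    return variants


# Invented sequences: the sample differs from the reference at one base.
reference = "ACGTTGCAAGGCTTACGGATCCGATTACAGGCTA"
sample = "ACGTTGCAAGGCTTACGGATCAGATTACAGGCTA"

reads = fragment(sample, read_len=12, step=2)
variants = call_variants(reads, reference)
print(variants)  # each entry is (0-based position, reference base, sample base)
```

Real aligners and variant callers also have to cope with sequencing errors, repeated regions, insertions and deletions, and two copies of each chromosome, none of which this sketch attempts.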

It is possible to sequence the whole human genome quickly, and relatively inexpensively, using these techniques. When we sequence the whole human genome, we identify the full extent of human variation: 5-10 million genetic variants per person, including 20,000 “coding” variants which fall within transcribed genes.

For many applications of NGS we are trying to find a single genetic variant relevant to a specific disease or trait. Finding the one variant we are interested in amongst this vast amount of genomic data is akin to finding a needle in a haystack. Methods have therefore been developed which allow us to sequence smaller regions of the genome, leaving less variation to analyse. For instance, we could look just at the 1-2% of the genome which codes for proteins (the “exome”), or only at the regions of the genome which harbour specific genes we are interested in (“gene panels”). This allows us to sequence just the portion of the genome we think is most likely to yield the relevant variation, and ignore the rest. Methods which allow us to select smaller regions of the genome for sequencing are known as “target enrichment”, “capture” or “pull down” techniques.
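
To give a feel for how much a gene panel narrows the search, the sketch below filters a list of variants down to those falling inside a set of target regions. Note the assumption: real target enrichment happens in the laboratory, before sequencing, so this is only a computational analogy of the same idea. The region coordinates, variants, and function names (`in_targets`, `restrict_to_panel`) are all invented for illustration.

```python
def in_targets(chrom, pos, targets):
    """True if (chrom, pos) lies inside any target region (ends inclusive)."""
    return any(c == chrom and start <= pos <= end
               for c, start, end in targets)


def restrict_to_panel(variants, targets):
    """Keep only the variants located within the panel's target regions."""
    return [v for v in variants if in_targets(v[0], v[1], targets)]


# Hypothetical panel: (chromosome, start, end) for each region of interest.
panel = [("chr1", 1000, 2000), ("chr2", 500, 800)]

# Hypothetical variants: (chromosome, position, reference base, sample base).
variants = [
    ("chr1", 1500, "A", "G"),
    ("chr1", 5000, "C", "T"),
    ("chr2", 800, "G", "A"),
    ("chr3", 10, "T", "C"),
]

kept = restrict_to_panel(variants, panel)
print(f"{len(kept)} of {len(variants)} variants fall inside the panel")
```

The fewer regions we target, the fewer variants remain to be interpreted, at the cost of missing anything that lies outside the panel.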

Talking point

Now that we can analyse both the chromosomes and the DNA sequence at high resolution, do we have the tools to diagnose all genetic susceptibility to disease?


This article is from the free online course:

The Genomics Era: the Future of Genetics in Medicine

St George's, University of London
