What does the BTK plot show?

Learners can practise reading a BTK plot and interpreting what the elements mean.

Now that you understand how the BTK viewer displays data, you should be able to give a detailed answer to the questions from step 1.12 about Didymella arachidicola:

  • What is the average GC and coverage of the blue and green circles?
  • What taxonomic groups do the colours represent?
  • Does this BTK view represent a clean genome assembly of one species? If there is a contaminant or second organism, how many such organisms are there?

This step is here to help you apply what you’ve learned so far. It’s not a test, so if you’re still not sure, don’t worry! You can use the comments section to ask about anything you are finding confusing and we educators will give you some more explanation.

The BTK view of the Didymella arachidicola WOCF01 genome assembly shows two blobs, or sets of circles, coloured blue and green. Remember that the colours represent taxonomic assignments.

The BTK view of the Didymella arachidicola WOCF01 genome assembly

Thus, the blue blob represents the target organism D. arachidicola, a fungus of the phylum Ascomycota. It has an average GC of 0.53 (53%) and a sequencing coverage of ~16 (i.e. there were approximately 16 copies of the full D. arachidicola genome in the sequencing library).

The green blob represents a separate organism, a bacterium of the phylum Proteobacteria. This blob has an average GC of 0.67 (67%) and a coverage of ~11.
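To make the idea of a per-blob average concrete, here is a minimal Python sketch that computes the GC fraction of a sequence and then summarises GC and coverage for each taxonomic group. The function names, the dictionary keys and the contig values are all made up for illustration, and the length-weighted mean is an assumption of this sketch; it is not necessarily how the BTK viewer calculates the numbers it displays.

```python
from collections import defaultdict


def gc_fraction(seq):
    """Fraction of G and C bases among the A, C, G and T bases of a sequence."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def summarise_by_group(contigs):
    """Length-weighted mean GC and coverage for each taxonomic group.

    `contigs` is a list of dicts with the (hypothetical) keys
    'phylum', 'length', 'gc' and 'coverage'.
    """
    totals = defaultdict(lambda: {"length": 0, "gc": 0.0, "coverage": 0.0})
    for contig in contigs:
        group = totals[contig["phylum"]]
        group["length"] += contig["length"]
        # Weight each contig by its length so long contigs count for more.
        group["gc"] += contig["gc"] * contig["length"]
        group["coverage"] += contig["coverage"] * contig["length"]
    return {
        phylum: {
            "gc": group["gc"] / group["length"],
            "coverage": group["coverage"] / group["length"],
        }
        for phylum, group in totals.items()
    }


# Toy contigs invented for this example (not the real WOCF01 data).
toy_contigs = [
    {"phylum": "Ascomycota", "length": 1000, "gc": 0.50, "coverage": 10},
    {"phylum": "Ascomycota", "length": 3000, "gc": 0.54, "coverage": 18},
    {"phylum": "Proteobacteria", "length": 2000, "gc": 0.67, "coverage": 11},
]

for phylum, stats in summarise_by_group(toy_contigs).items():
    print(phylum, round(stats["gc"], 2), round(stats["coverage"], 1))
```

Each group in the output corresponds to one coloured blob: a single (GC, coverage) position around which the circles for that group cluster.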

If both sets of contigs had come from the same organism, we would expect them to have approximately the same number of copies (i.e. coverage) and approximately the same GC. Although individual contigs and sequences can vary in coverage and GC, they tend to cluster around the same values if they belong to the same organism.

This is the main reason why blob plots work: different GC-coverage blobs represent DNA from different organisms.

There are a few exceptions. Sex chromosomes may be present at half the coverage of the rest of the genome (e.g. in a human male sample, which has one X and one Y chromosome, contigs from those chromosomes will show half the coverage/depth of all the other chromosomes, which are present in pairs). Organelles such as mitochondria might also have a very different GC content from the nuclear genome. But, in general, the GC and coverage of all the sequences from one organism will tend to cluster together.
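The sex chromosome case is simple arithmetic, and a short Python sketch may help. It assumes that coverage scales in proportion to the number of copies of a sequence per cell; the function name and the example depth of 30 are hypothetical and chosen only for illustration.

```python
def expected_coverage(paired_coverage, copies, ploidy=2):
    """Expected coverage of a sequence present in `copies` copies per cell,
    given the coverage seen on chromosomes present in `ploidy` copies.

    Assumes coverage is proportional to copy number.
    """
    if ploidy <= 0 or copies < 0:
        raise ValueError("ploidy must be positive and copies non-negative")
    return paired_coverage * copies / ploidy


# Hypothetical male sample whose paired chromosomes are covered at 30x:
# the single X and the single Y are each expected at half that depth.
print(expected_coverage(30, copies=1))  # 15.0
```

On a blob plot, contigs like these would sit at the same GC as the rest of the genome but lower on the coverage axis, which is why they should not be mistaken for a second organism.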

The green Proteobacteria blob is definitely an extra organism in this fungal genome assembly, but we can’t tell from the GC-coverage plot whether it is a contaminant, a symbiont, an infection or food. A better term is therefore cobiont (i.e. it co-occurs in the same DNA sample, but we don’t know what role it plays in that sample). We will look at further detail and examples of interpreting cobionts later in the course.

Do you understand how to interpret a basic BTK plot now? What further questions do you have at this stage?

© Wellcome Connecting Science
This article is from the free online course Eukaryotic Genome Assembly: How to Use BlobToolKit for Quality Assessment

Created by
FutureLearn - Learning For Life
