Looking at all the species in a genome assembly

In this video, Sujai explains how to use the BTK web viewer to identify all the different organisms in a complex plot.

Let’s take a look at the genome assembly of Lucilia cuprina, the Australian sheep blowfly, whose larvae infest sheep and cause a condition called “sheep strike” or “flystrike”.

BTK plot for Lucilia cuprina, Australian sheep blowfly

The first thing that jumps out is that there are multiple blobs and different colours signifying different taxonomic groups. The legend on the top right of the plot suggests that there are more than two species present in this genome assembly. However, it can be hard to see all of them because some points may be drawn on top of other points. In this section you’re going to see how to use the BTK viewer in a few different ways to examine all the different hits systematically.

To summarise the video, there are three main ways to identify separate organisms within a BTK plot:

  • Clear and separate blobs – you can zoom in on the plot by changing the axis ranges in the Settings Menu to check whether you see separate blobs.
  • Consistent taxonomic assignment at higher and lower taxonomic ranks – remember to check different taxonomic levels (Phylum, Class, Order, Family, etc.).
  • Zero coverage in some read sets – if more than one read set is available for calculating sequencing coverage, then plotting one read set on the X axis and another on the Y axis should show all contigs falling along a diagonal line (i.e. proportional coverage in both read sets). If any contigs have zero or very low coverage in one read set, that is strong evidence that they are contaminants, because no reads in that read set map back to those contigs.
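
The read-set comparison in the last point can also be checked outside the viewer. The following is a minimal sketch in Python, assuming you have already exported per-contig coverage for two read sets into a dictionary. The function name `flag_low_coverage`, the thresholds `min_cov` and `max_ratio`, and the example contig names and values are all hypothetical and are not part of BTK.

```python
def flag_low_coverage(coverage, min_cov=1.0, max_ratio=0.05):
    """Return contigs that are well covered in one read set but
    have zero or very low coverage in the other.

    coverage  -- dict mapping contig ID to (coverage_a, coverage_b)
    min_cov   -- hypothetical threshold: the better-covered read set
                 must reach at least this coverage
    max_ratio -- hypothetical threshold: the lower coverage must be at
                 most this fraction of the higher one
    """
    flagged = []
    for contig, (cov_a, cov_b) in coverage.items():
        low, high = sorted((cov_a, cov_b))
        # Contigs on the diagonal have similar coverage in both read
        # sets, so their low/high ratio is far above max_ratio.
        if high >= min_cov and low <= max_ratio * high:
            flagged.append(contig)
    return flagged


# Made-up example values, for illustration only.
example_coverage = {
    "contig_1": (42.0, 39.5),  # similar coverage in both read sets
    "contig_2": (35.0, 0.0),   # no reads from the second read set
    "contig_3": (0.4, 18.0),   # almost no reads from the first read set
    "contig_4": (0.2, 0.1),    # poorly covered everywhere
}

for contig in flag_low_coverage(example_coverage):
    print(contig)
```

Contigs that are poorly covered in both read sets are deliberately left out here, because the argument above only applies when a contig is clearly present in one read set and absent from the other. The thresholds are arbitrary starting points and would need adjusting to the sequencing depth of your own data.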
