
BlobToolKit – what is it?

About BlobToolKit, a software suite for identifying contaminants and other species present in eukaryotic genome assemblies

Let’s start with BlobToolKit

BlobToolKit is a software suite for identifying contaminants and other species present in eukaryotic genome assemblies. When researchers sequence the genome of any species, non-target organisms are often present in the DNA sample.

BlobToolKit, or BTK as we like to call it for short, is available as a set of command-line tools for processing the genome assembly files, read data files, and other analysis files, and turning them into plots that can be interactively explored using any web browser. The BTK pipeline is being run on all publicly available genome assemblies and the BTK plots are available at a website where anyone can explore them. In step 1.11 we explain why some public assemblies aren’t yet on the BTK website.

BTK works because different organisms in a sample are almost always present in slightly different amounts and have slightly different DNA characteristics. When we extract and sequence the DNA from the sample, one organism is therefore likely to have a different sequencing depth (also known as coverage) and a different average GC content (the proportion of bases that are G or C) from another. When we plot these two characteristics, coverage and GC, for every sequence in an assembly on a two-dimensional scatter plot, different organisms typically show up as separate blobs.
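To make the idea concrete, here is a minimal Python sketch of the two values that place a sequence on such a plot. It is an illustration only and not BlobToolKit's own code: the function names `gc_content` and `blob_coordinates`, the contig names, the sequences, and the coverage values are all made up for this example.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the A, C, G and T bases.

    Ambiguous bases such as N are ignored. An empty or all-N sequence
    returns 0.0 to avoid dividing by zero.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def blob_coordinates(records):
    """Turn (name, sequence, coverage) records into (name, GC, coverage) points."""
    return [(name, gc_content(seq), cov) for name, seq, cov in records]


# Hypothetical toy data: two AT-rich contigs at high coverage and
# two GC-rich contigs at low coverage, as if from two organisms.
contigs = [
    ("contig_1", "ATATATGCATATTTAA", 52.0),
    ("contig_2", "TTAAGCATATATATTA", 48.5),
    ("contig_3", "GCGCGGCCGCATGCGC", 6.1),
    ("contig_4", "GGCCGCGCTAGCGGCC", 5.4),
]

points = blob_coordinates(contigs)
for name, gc, cov in points:
    # Each line is one point: GC would go on the X axis, coverage on the Y axis.
    print(f"{name}\tGC={gc:.3f}\tcoverage={cov}")
```

In this toy data the first two contigs share one GC and coverage range and the last two share another, so they would fall into two separate groups on a scatter plot. Real assemblies contain far more sequences, and their coverage comes from mapping reads back to the assembly rather than from hand-written numbers.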

You will learn more about interpreting the BTK plot later in the week. This diagram gives you a general idea of how the blobs might look when separated by coverage and GC:

[Image: Schematic diagram of a BTK scatterplot with coverage on the Y axis and GC content on the X axis, and several coloured blobs at different positions]

Figure 1: A cartoon BTK plot showing the mollusc Cypraea chinensis contigs in green in a separate blob from other organisms present in the same assembly

In this course, you will learn why BTK is an essential tool for anyone who wants to assemble genomes from DNA read data, or anyone who wants to use these genome assemblies in their research. You will learn how to use the public BTK viewer website and how to apply different filters to look for contaminants, parasites, and other non-target organisms in these public genome assemblies.

© Wellcome Connecting Science
This article is from the free online course Eukaryotic Genome Assembly: How to Use BlobToolKit for Quality Assessment

Created by
FutureLearn - Learning For Life
