Skip to 0 minutes and 13 seconds In this video, we will examine BLAST. BLAST stands for Basic Local Alignment Search Tool. This tool allows you to take a sequence and search a large database to find sequences that are similar or related to it.
Skip to 0 minutes and 32 seconds Our starting point is the NCBI web page. You can find the link in the text below this video.
Skip to 0 minutes and 41 seconds This is one of the most used tools for biologists. There are several BLAST programs. There is nucleotide BLAST, protein BLAST, and other translated BLAST programs, which are a little bit more advanced. In general, if you have a DNA sequence, you will use nucleotide BLAST. If you have an amino acid sequence, you will use protein BLAST.
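The rule of thumb above (DNA goes to nucleotide BLAST, amino acids go to protein BLAST) can be sketched in a few lines of Python. This is only an illustration: the function name choose_blast_program and the 90% cut-off are assumptions made for this example, not part of BLAST itself, and a real sequence with many ambiguity codes could be misclassified.

```python
def choose_blast_program(sequence: str) -> str:
    """Guess whether a sequence should go to nucleotide or protein BLAST.

    Returns "blastn" for DNA-like input and "blastp" otherwise.
    The 90% cut-off is a hypothetical heuristic chosen for this sketch.
    """
    # Remove whitespace and line breaks, then normalise to upper case.
    residues = "".join(sequence.split()).upper()
    if not residues:
        raise ValueError("empty sequence")

    # Count characters that are DNA bases (A, C, G, T) or the unknown base N.
    dna_like = sum(residues.count(base) for base in "ACGTN")

    # Mostly DNA letters: nucleotide BLAST. Otherwise assume amino acids.
    return "blastn" if dna_like / len(residues) >= 0.9 else "blastp"


print(choose_blast_program("ATGCGTACGTTAGC"))       # DNA-like input
print(choose_blast_program("MKVLAAGIVGLLLAQPSFA"))  # protein-like input
```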
Skip to 1 minute and 9 seconds So I’m going to open up a sequence we are going to use for this video. This sequence is an example sequence recovered from DNA sequencing. We want to find out what this DNA is or find other sequences which are closely related to this particular sequence. So I’m now going to copy this sequence to the clipboard.
Skip to 1 minute and 37 seconds So let’s go back to the NCBI web page. Because we have a DNA sequence, we will click ‘Nucleotide BLAST’, which will take us to the BLAST input page.
Skip to 1 minute and 52 seconds This box at the top, ‘Enter Query Sequence’, is where we need to paste the input sequence from the clipboard. Next, we want to choose the database that we want to search against. Because we chose ‘Nucleotide BLAST’, the database has been automatically set to the Nucleotide collection. You can click the question mark to see what it is. Essentially, it is a nucleotide database which consists of the sequences pulled from various major DNA repositories across the world.
Skip to 2 minutes and 34 seconds Because we are not going to be searching against a particular organism, we will leave the ‘Organism’ box empty and leave the rest of the settings as they are.
Skip to 2 minutes and 46 seconds If I expand ‘Algorithm Parameters’ by clicking this, it will open some advanced parameters. When you become more familiar with BLAST, you can adjust those too. For this video, however, we’ll simply click ‘BLAST’ to start the process.
Skip to 3 minutes and 5 seconds After a few seconds or minutes, depending on the nature of your sequence and how busy the BLAST servers are, you will have an output screen showing the result. At the top, there are some basic results and a graphic summary.
Skip to 3 minutes and 22 seconds Let’s go right down to the actual hits. Here, hits are sorted by the max score and, in most cases, by the e-value, or expected value, from the lowest to the highest.
Skip to 3 minutes and 39 seconds What we look for here is something with a very, very low e-value. We can assume that any hits with an e-value less than 0.0001 are homologous, or closely related, to the input sequence. This is a general rule, but there are exceptions, which we will not discuss here. In our case, we have hits with e-values of 0.0, which is telling us that these hits are very closely related. In fact, many of them are 100% identical to the input sequence.
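As a rough sketch of the rule just described, the Python below keeps only hits with an e-value under 0.0001 and lists them from the lowest e-value upwards. The Hit record, the function name significant_hits and every value in example_hits are invented for illustration; they are not the results shown in the video, and this is not how the BLAST web page itself is implemented.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """A simplified, hypothetical record for one row of a BLAST hit table."""
    description: str
    max_score: float
    e_value: float
    percent_identity: float


# Rule of thumb from the video: e-values below 0.0001 suggest homology.
E_VALUE_CUTOFF = 1e-4


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits below the cut-off, lowest e-value first.

    Ties on e-value are broken by the higher max score.
    """
    kept = [hit for hit in hits if hit.e_value < cutoff]
    return sorted(kept, key=lambda hit: (hit.e_value, -hit.max_score))


# Made-up example data, for illustration only.
example_hits = [
    Hit("hypothetical hit C", 40.1, 0.5, 88.0),
    Hit("hypothetical hit D", 150.0, 3e-30, 92.0),
    Hit("hypothetical hit B", 2693.0, 0.0, 99.93),
    Hit("hypothetical hit A", 2699.0, 0.0, 100.0),
]

for hit in significant_hits(example_hits):
    print(hit.description, hit.e_value, hit.percent_identity)
```

Hit C is dropped because its e-value of 0.5 is above the cut-off. As the video notes, this threshold is a general rule with exceptions, so it should be treated as a first filter rather than proof of relatedness.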
Skip to 4 minutes and 17 seconds So it appears that we can safely assume that the input sequence is a piece of DNA that codes for a 16S ribosomal RNA, and it comes from a species of the genus Bacillus, although we can’t pinpoint which exact species. This is because there are many hits from different species of Bacillus, which are identical to this particular query sequence.
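The reasoning here (one genus, several species among the identical hits) can be sketched as a simple tally. The function name summarise_taxa and the organism list are assumptions made up for this example, not the actual hits from the video, and the code assumes each name starts with "Genus species".

```python
from collections import Counter


def summarise_taxa(organism_names):
    """Count how often each genus and each species appears in a list of hits.

    Assumes every name begins with "Genus species"; any extra words,
    such as strain labels, are ignored.
    """
    genera = Counter(name.split()[0] for name in organism_names)
    species = Counter(" ".join(name.split()[:2]) for name in organism_names)
    return genera, species


# Made-up list of organisms for hits that are 100% identical to the query.
identical_hits = [
    "Bacillus cereus strain X1",
    "Bacillus thuringiensis strain X2",
    "Bacillus anthracis strain X3",
    "Bacillus cereus strain X4",
]

genera, species = summarise_taxa(identical_hits)
print(genera)   # a single genus accounts for every hit
print(species)  # several species share that genus
```

When every identical hit falls in one genus but the species differ, the query can be assigned to the genus with some confidence, while the species remains unresolved.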
Skip to 4 minutes and 45 seconds What’s interesting is that if we scroll down, there are some hits with less than 100% identity. So let’s click one of them, which will take us even further down the page and show an alignment. We can see the query sequence at the top and the ‘Subject’ sequence, which is Bacillus mycoides, at the bottom.
Skip to 5 minutes and 11 seconds Here, we can see that it has 1,460 matches out of 1,461 bases. By closely examining the alignment, we can find where the mismatch is: here, the C and T. This is what we call a single nucleotide polymorphism, or SNP, pronounced ‘snip’. And it is an example of how DNA sequences change or evolve. If we want to start all over again with a new query sequence, we can go to the top and click the ‘Edit and Resubmit’ button. This will take you back to the BLAST main page, where you can restart with any adjustments.
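Counting matches and locating the mismatch in an alignment, as done by eye above, can be sketched in Python. The function name compare_aligned and the two short sequences are invented for illustration; the code assumes both sequences are already aligned to the same length and simply treats a gap character as a mismatch, which is a simplification of how BLAST reports identities and gaps.

```python
def compare_aligned(query: str, subject: str):
    """Compare two aligned sequences position by position.

    Returns the number of matches, a list of mismatches as
    (1-based position, query base, subject base), and percent identity.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    if not query:
        raise ValueError("empty alignment")

    mismatches = [
        (position, q, s)
        for position, (q, s) in enumerate(zip(query, subject), start=1)
        if q != s
    ]
    matches = len(query) - len(mismatches)
    identity = 100 * matches / len(query)
    return matches, mismatches, identity


# Short made-up alignment with a single C-to-T difference.
matches, mismatches, identity = compare_aligned("ACGTCGA", "ACGTTGA")
print(matches, mismatches, round(identity, 2))

# The figures from the video: 1,460 matches out of 1,461 bases.
print(round(100 * 1460 / 1461, 2))  # about 99.93% identity
```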
How to analyse a DNA sequence
In Step 2.5, you watched Harriet demonstrate how to isolate bacteria from a sample, make a pure bacterial culture and perform a Gram stain.
When we were watching Harriet demonstrate these techniques, we started to identify the species of soil bacteria by observing some of its characteristic traits, including: the appearance of the colonies on agar, which we refer to as the colony morphology; and the appearance of the Gram-stained cells under the microscope, which we refer to as the cellular morphology. These two features of our bacterial soil isolate are part of a set of observable traits that we refer to as a phenotype.
In order to identify the soil bacteria, we could continue working with the pure culture in the laboratory and perform a variety of tests to determine other phenotypic traits (eg oxygen requirement, ability to ferment different sugars, production of antibiotics or toxins, motility). You will learn more about some of these tests and how they can be used to identify microbes later in the course.
A microbe’s phenotype is largely determined by its genetic material, and so an alternative strategy is to use molecular techniques such as PCR and DNA sequencing to analyse the microbial genome directly. In this video Dr Soon Gweon, Lecturer in Bioinformatics for Genomics, explains how computer software can be used to analyse DNA sequencing results to identify microbes and detect genetic mutations, including single base changes (called single nucleotide polymorphisms, SNPs).
In the next Step, you’ll have an opportunity to identify a different soil isolate using the BLAST software. Don’t forget to mark this one as complete before you move on.
© University of Reading