Skip to 0 minutes and 5 seconds Welcome to course two, Week 2. I’m Christine Boinett and I work at the Oxford University Clinical Research Institute. In this video, we’ll learn about the basic functions of BLAST. And for this, go to your favourite web browser and type in BLAST NCBI.
Skip to 0 minutes and 27 seconds You’ll reach the landing page, where you’ll see four different search programmes; here, we’ll use Nucleotide BLAST. On the search page, you’ll find three sections: the first is Enter Query Sequence, the second is Choose Search Set, and the third is Program Selection. Here we’ll use an accession number, but you could also paste in a FASTA sequence or upload a file that you’ve previously saved, using the Choose File function. We’ll use a sequence that we’ve used before, which is accession number X81322. In Choose Search Set, there are three different databases you can use: human, mouse, and others.
Skip to 1 minute and 13 seconds In this instance, we’re going to use Others, which is the non-redundant database and includes all the sequences that have been deposited in the repository for BLAST.
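For readers who would rather paste a sequence than type an accession, it may help to see what FASTA text looks like. The snippet below is a minimal sketch of a FASTA reader; the function name `parse_fasta` and the record `demo_query` are hypothetical and made up for illustration, and the sequence is not the real X81322 entry.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A ">" line starts a new record; store the previous one first.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are joined; the line wrapping carries no meaning.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record, for illustration only.
example = """>demo_query made-up record
ATGCGTACGT
TTGACCA
"""

for name, sequence in parse_fasta(example):
    print(name, len(sequence))
```

The header line begins with `>` and everything up to the next header is sequence, which is the layout the query box expects when you paste FASTA text.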
Skip to 1 minute and 25 seconds In the Choose Search Set section, under Organism, you can either include or exclude a particular taxon. For example, if I choose to include just bacteria, it’ll only search bacterial sequences. You include a taxon by leaving the Exclude box blank, or you tick the box to exclude it. For now, we’ll choose to exclude. If you then move to the third section, which is Program Selection, you’ll find three different modes of BLAST: highly similar sequences (megablast), more dissimilar sequences (discontiguous megablast), and somewhat similar sequences (blastn). What’s important to note here is that megablast is intended for sequences that are 95% or more identical.
Skip to 2 minutes and 9 seconds For more information on the types of BLAST, just click on the Information tab and you’ll get details of all three versions, like so. We should then review what we’ve put in. At the top is our query sequence and, as you can see, it’s auto-filled to the hpcC gene. We’re using the non-redundant database and a taxon ID. In this example, because we’re using an E. coli gene, we don’t want E. coli sequences in the results; we want to look for other organisms that might have the same gene. So here, we’ll type in E. coli, and a dropdown menu will come up with its taxon ID, 562. In this instance, we’ll tick the box to exclude it.
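The same search set-up can also be written down as parameters for NCBI’s BLAST URL API, which is a convenient way to record exactly what was entered. This is a sketch under stated assumptions rather than the method used in the video: the helper name `build_blast_submission` is hypothetical, `nt` is assumed to be the database behind the Others option, and the exact wording of the Entrez filter used to exclude a taxon is an assumption. The code only builds the parameters and does not contact NCBI.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_submission(accession, exclude_taxid=None):
    """Build submission parameters for a megablast search (no network call)."""
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastn",   # nucleotide query against nucleotide database
        "MEGABLAST": "on",     # the "highly similar sequences" mode
        "DATABASE": "nt",      # assumption: nucleotide collection behind Others
        "QUERY": accession,
    }
    if exclude_taxid is not None:
        # Assumption: an Entrez filter of this form excludes the given taxon.
        params["ENTREZ_QUERY"] = f"all[filter] NOT txid{exclude_taxid}[ORGN]"
    return params


# The search from the video: accession X81322, excluding E. coli (taxon ID 562).
params = build_blast_submission("X81322", exclude_taxid=562)
print(BLAST_URL + "?" + urlencode(params))
```

Leaving `exclude_taxid` unset corresponds to searching the whole database, which is what you would get in the web form by leaving the Organism field empty.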
Skip to 3 minutes and 3 seconds Now we’ll hit BLAST. You’ll then be taken to the searching function page. Whilst we’re waiting, it’s worth noting that BLAST has become a really complex tool, so feel free to try the other options and see how they change the search. This is the result of the search. At the top you’ll see exactly what you searched for, in this case the E. coli hpcC gene, what type of molecule it is, and the length of the query. On the right you’ll see which database we used and any other information that you need. Next is the graphical summary. At the top is the key, which I’ll come back to later.
Skip to 3 minutes and 43 seconds Next is the turquoise box, which represents your nucleotide query; from the scale you can see that it’s roughly 1,500 nucleotides long. Below it are the red bars, one for each hit, and their colour indicates the alignment score. If you go back to the key, you’ll see the score range associated with each colour. Here, in our results, most of our hits, if not all, score over 200. The higher the score, the better the alignment. If we then minimise that, next is the Descriptions tab. Here, you’ll see the sequences, and the organisms they come from, that are similar to your query sequence.
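The colour key in the graphical summary is simply a lookup from alignment score to colour. The sketch below assumes the score bins of the classic NCBI key (below 40, 40 to 50, 50 to 80, 80 to 200, and 200 or more); only the red bars for scores over 200 are confirmed by the video, so treat the other boundaries and colour names as assumptions. The function name `score_colour` is hypothetical.

```python
def score_colour(score):
    """Map an alignment score to a colour, using assumed key boundaries."""
    if score < 40:
        return "black"
    if score < 50:
        return "blue"
    if score < 80:
        return "green"
    if score < 200:
        return "magenta"
    return "red"  # 200 or more: the red bars seen in the video


for score in (35, 45, 60, 150, 2500):
    print(score, score_colour(score))
```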
Skip to 4 minutes and 26 seconds On the right-hand side of the Descriptions table, you’ll see various numbers, which include the max score, total score, query coverage, E value, percentage identity, and accession of the subject. What’s important to note here is the E value. The E value, or expect value, is the number of matches with a score at least this good that you would expect to find by chance, given the length of the query and the size of the database. The lower the E value, the better. If you minimise this, you can go to the Alignments section. At the top is the entry that your query sequence matched, and below are the two sequences: the query being yours and the subject being the Shigella entry.
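To see why the E value depends on both the query length and the database size, it can help to compute one by hand. The sketch below uses the standard relationship between bit score and expect value, E = m × n × 2^(-S), where m is the query length, n is the database length, and S is the bit score. BLAST itself uses edge-corrected effective lengths, so its reported values will differ somewhat; the database sizes and bit score below are made-up numbers for illustration.

```python
def expect_value(bit_score, query_length, database_length):
    """Approximate E value from a bit score: E = m * n * 2**(-S)."""
    return query_length * database_length * 2.0 ** (-bit_score)


query_length = 1500  # roughly the length of the query in the video

# Hypothetical database sizes (total residues), for illustration only.
small_db = 1e9
large_db = 1e10

for bit_score in (30, 50, 200):
    print(
        bit_score,
        expect_value(bit_score, query_length, small_db),
        expect_value(bit_score, query_length, large_db),
    )
```

For the same bit score, a database ten times larger gives an E value ten times larger, and each additional bit of score halves the E value. This is why a hit should be judged by its E value in the context of the database that was searched.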
Skip to 5 minutes and 9 seconds In the alignment, you can also see vertical lines between the two sequences, which mark the positions where the bases match; where there is no line, there is a mismatch or a gap. If you want to explore your gene further, you can always hit the Edit and Resubmit button, and it will take you back to the query sequence search page. Here, you can change different parameters. For example, you can choose to include E. coli this time or, in fact, choose a bigger set of bacteria and see what results you get with that. That concludes our demonstration of BLAST. Here, we’ve used a known gene to search a sequence database and seen which sequences matched and which didn’t.
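The line of vertical bars in the alignment view, and the percentage identity reported with it, can be reproduced from two aligned sequences. This is a minimal sketch, assuming both sequences are already aligned to the same length with `-` marking gaps; the function names and the two short sequences are hypothetical and are not taken from the X81322 result.

```python
def match_line(query, subject):
    """Return the midline: '|' where aligned bases are identical, else ' '."""
    if len(query) != len(subject) or not query:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query, subject)
    )


def percent_identity(query, subject):
    """Identical positions as a percentage of the alignment length."""
    line = match_line(query, subject)
    return 100.0 * line.count("|") / len(line)


# Hypothetical aligned pair: one gap in the query and one mismatch.
query = "ATGC-TAC"
subject = "ATGCGTTC"

print("Query  ", query)
print("       ", match_line(query, subject))
print("Sbjct  ", subject)
print(percent_identity(query, subject))  # 75.0 (6 identical out of 8 columns)
```

Gapped columns count towards the alignment length but never as matches, so both gaps and mismatches lower the percentage identity.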
Skip to 5 minutes and 52 seconds Hope you’ve enjoyed this example. Please feel free to leave any comments in the comments section below.
Use of BLAST (Basic Local Alignment Search Tool)
In this video tutorial we demonstrate the use of the similarity search engine BLAST, hosted by NCBI.
BLAST is a powerful tool used to search a database of DNA or protein sequences in order to find “hits” that are similar to a query sequence. BLAST is used for several purposes, including inferring the possible function of a protein.
The NCBI website includes a very user-friendly BLAST server through which one may search NCBI’s nucleotide and protein sequence databases.
In this video, Dr Christine Boinett will demonstrate many of the most useful applications of BLAST that are available on NCBI’s web server.
To get the most out of this step, we recommend that you try to replicate the steps. You can do so by pausing the video and performing the tasks in your internet browser, or you can watch it first and replicate the steps later.
© Wellcome Genome Campus Advanced Courses and Scientific Conferences