DIY BLAST. Part 1 - Step-by-step

In this article, you will learn how to run a BLAST search. You will be guided step by step through submitting a search to the NCBI BLAST server.

In this first task, you will use an accession number to retrieve a nucleotide or protein sequence from GenBank and then use BLAST to find similar sequences in the databases you specify.

  1. Go to the GenBank/NCBI front page https://www.ncbi.nlm.nih.gov/
  2. Type “WP_000161708.1” in the search box and make sure you leave the “All Databases” option selected in the drop-down menu. This allows the search to run across many databases rather than being restricted to nucleotide or protein databases. So far, we don’t know whether WP_000161708.1 is a DNA or protein entry.
  3. Click Search. What type of results are shown? Hint: there will be a direct link to the entry that corresponds to the accession number WP_000161708.1 as well as an entry under the “Protein” category.
  4. Navigate to the entry by clicking on “invasion-associated secreted protein” or use this link. Is this entry a nucleotide or protein sequence?
  5. You will now perform a BLAST search to look for similar sequences. First, you will have to retrieve the sequence. To do so, go to “Send to” on the top right, click on the drop-down menu, select “File” and, under format, select “FASTA”.
  6. Download the sequence and open the file in a text editor.
  7. Navigate to the BLAST server.
  8. Which BLAST programme should you use, based on the query type (nucleotide or protein) and the database type (nucleotide or protein)? Select one from the options.
  9. Paste the query sequence in the search box. Leave all the options as shown by default but click on “Show results in a new window” found at the bottom of the page next to the BLAST button. Now click BLAST and wait for the results.
  10. Study the results. What organisms are shown to have similar sequences to your query? How similar are these sequences to your query? Hint: compare E-values and scores.
  11. Let’s repeat the BLAST search and change some parameters. You can edit your search by clicking on the “Edit and Resubmit” link at the top left, or by returning to the original tab and editing the parameters there.
  12. Let’s change the database parameters. Let’s limit our search to “Yersinia” species. To do this, type “Yersinia” in the “Organism” box inside the “Choose Search Set” section. Perform the BLAST search.
  13. Study the results. Are there any good matches within “Yersinia” species?
  14. Let’s repeat the search but this time we will limit the results to “Shigella” species.
  15. Study the results. Are there any good matches in Shigella?
  16. Repeat the BLAST search excluding all enterobacteria. To do this, type “enterobacteria” in the “Organism” box inside the “Choose Search Set” section and tick the “Exclude” option. Perform the BLAST search. What are your findings? Concentrate on the non-Salmonella entries.
  17. Based on the combined results, what can you conclude about the presence of WP_000161708.1 in other bacteria?
  18. Repeat the analysis using nucleotide databases instead of protein databases. (Hint: go to the BLAST home page and look for a programme that allows you to use a protein query against a translated nucleotide database.) How do the results compare to the first part of this task? Limit or exclude species from your results as before.
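
Two of the steps above can also be sketched in code: retrieving the FASTA record (steps 5 and 6) and choosing a BLAST programme from the query and database types (steps 8 and 18). The Python sketch below is only an illustration, not part of the exercise. It assumes the public NCBI E-utilities `efetch` endpoint as an alternative to the “Send to” menu, and the function names `fasta_url` and `choose_programme` are hypothetical helpers introduced here. It only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (assumed here as a scripted
# alternative to the "Send to" > "File" > "FASTA" menu).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fasta_url(accession: str, db: str = "protein") -> str:
    """Build an efetch URL that returns the record as FASTA text.

    `db` must match the entry type ("protein" or "nuccore").
    """
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


# (query type, database type) -> BLAST programme.
# tblastx (translated nucleotide query vs translated nucleotide
# database) is left out to keep the table to the four basic cases.
BLAST_PROGRAMMES = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated
    ("protein", "nucleotide"): "tblastn",  # database is translated
}


def choose_programme(query_type: str, database_type: str) -> str:
    """Return the BLAST programme for a query/database combination."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMMES:
        raise ValueError(f"unknown combination: {key}")
    return BLAST_PROGRAMMES[key]


if __name__ == "__main__":
    # Print the download URL for the accession used in this task.
    print(fasta_url("WP_000161708.1"))
    # List every query/database combination and its programme.
    for (query, database), programme in BLAST_PROGRAMMES.items():
        print(f"{query} query vs {database} database: {programme}")
```

Opening the printed URL in a browser should return the same FASTA text as the downloaded file, provided `db` matches the entry type. The lookup table is a deliberate simplification: it only encodes which side of the comparison is translated, which is the decision steps 8 and 18 ask you to make.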

