Comparing Two FASTA Files

© Wellcome Genome Campus Advanced Courses and Scientific Conferences

In this exercise we will compare two Staphylococcus aureus genomes in ACT and investigate the number of differences, also called synteny breaks.

The first sequence we will download is the genome of S. aureus strain TW20. This is an antibiotic-resistant strain. Click here to download the GenBank entry.

We will compare this genome to that of an antibiotic-sensitive strain, S. aureus MSSA476. Click here to download the GenBank entry.

To download each genome as a FASTA file, go to Send to on the right-hand side of the GenBank record, choose Complete Record, set the destination to File, and choose FASTA as the format.

Files downloaded from GenBank in this way are always named sequence.fasta. Rename each file to something more meaningful, for example TW20.fasta and MSSA476.fasta.
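If you prefer to script the renaming step, the following is a minimal Python sketch. The function name rename_fasta is hypothetical, and the check that the first line starts with ">" is only a rough test that the download really is a FASTA file.

```python
from pathlib import Path


def rename_fasta(downloaded: Path, new_name: str) -> Path:
    """Rename a downloaded FASTA file without overwriting an existing file."""
    # A FASTA file starts with a header line beginning with ">".
    with downloaded.open() as handle:
        first_line = handle.readline()
    if not first_line.startswith(">"):
        raise ValueError(f"{downloaded} does not look like a FASTA file")

    target = downloaded.with_name(new_name)
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    downloaded.rename(target)
    return target


# Example (run once after each download, before fetching the next genome):
# rename_fasta(Path("sequence.fasta"), "TW20.fasta")
```

Renaming straight after each download avoids the second sequence.fasta overwriting or being confused with the first.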

If you cannot download the files from a public repository you can also download them from the following FTP site. The BlastN comparison file called TW20_vs_MSSA476.txt can also be downloaded from this site: ftp://ftp.sanger.ac.uk/pub/resources/coursesandconferences/Online_Courses/Course4/Week1/Step_1.15/
You may need to copy and paste the link into your internet browser. We recommend using Chrome or Firefox for downloading data files.

Please note: Most browsers no longer support downloading files from an FTP site. However, there are a few steps you can take to access the files. You will need to use an FTP client to view and download the files. You can use a free client like Cyberduck, available for Mac and Windows users.
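As an alternative to a graphical FTP client, the files can also be fetched with Python's standard ftplib module. The sketch below assumes the server accepts anonymous logins and that you already know the file names you want; only TW20_vs_MSSA476.txt is named in this article, so any other file name you pass in is your own assumption. The helper names split_ftp_url and download_files are hypothetical.

```python
from __future__ import annotations

from ftplib import FTP
from pathlib import Path
from urllib.parse import urlparse

FTP_URL = (
    "ftp://ftp.sanger.ac.uk/pub/resources/coursesandconferences/"
    "Online_Courses/Course4/Week1/Step_1.15/"
)


def split_ftp_url(url: str) -> tuple[str, str]:
    """Split an ftp:// URL into (host, directory path)."""
    parsed = urlparse(url)
    if parsed.scheme != "ftp" or not parsed.hostname:
        raise ValueError("expected an ftp:// URL")
    return parsed.hostname, parsed.path


def download_files(url: str, names: list[str], dest: Path) -> list[Path]:
    """Download the named files from the FTP directory into dest."""
    host, directory = split_ftp_url(url)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    with FTP(host) as ftp:
        ftp.login()  # anonymous login (assumed to be allowed)
        ftp.cwd(directory)
        for name in names:
            target = dest / name
            with target.open("wb") as out:
                ftp.retrbinary(f"RETR {name}", out.write)
            saved.append(target)
    return saved


# Example (requires network access):
# download_files(FTP_URL, ["TW20_vs_MSSA476.txt"], Path("act_data"))
```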

We will now open the files in ACT. Double-click on the ACT icon. Once the small ACT window is open, choose the following three files and click Apply.

Sequence file 1: TW20.fasta

Comparison file: TW20_vs_MSSA476.txt

Sequence file 2: MSSA476.fasta

Now follow the steps we’ve outlined in the previous section. Zoom out (marked with an arrow) to get an overview of the complete genomes. Move the slider in the comparison view panel (the one in the middle, marked with a circle) all the way down to eliminate low-scoring similarities.

[Figure: ACT screenshot]
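The effect of the slider can be imitated outside ACT by filtering the comparison file directly. The sketch below assumes that TW20_vs_MSSA476.txt is in the standard 12-column tab-separated BLAST format (BLAST+ -outfmt 6), which this article does not state explicitly. The thresholds min_score and min_length are hypothetical parameters, not values taken from ACT.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    length: int         # alignment length (column 4)
    query_start: int    # column 7
    query_end: int      # column 8
    subject_start: int  # column 9
    subject_end: int    # column 10
    bit_score: float    # column 12


def parse_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated BLAST hits, skipping blank and comment lines."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(
            Hit(int(f[3]), int(f[6]), int(f[7]), int(f[8]), int(f[9]), float(f[11]))
        )
    return hits


def filter_hits(hits: List[Hit], min_score: float = 0.0, min_length: int = 0) -> List[Hit]:
    """Keep only hits at or above both thresholds."""
    return [h for h in hits if h.bit_score >= min_score and h.length >= min_length]


# Example:
# with open("TW20_vs_MSSA476.txt") as handle:
#     strong = filter_hits(parse_hits(handle), min_score=1000)
```

Raising the threshold removes short, weak matches first, which is why the large conserved blocks stay visible while the background noise disappears.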

Differences between the genomes are shown as white spaces in the comparison view panel. They are called synteny breaks.
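The white spaces can also be approximated numerically as stretches of one genome that no match covers. The following sketch takes match coordinates on a single genome (for example the query start and end columns of a tabular BLAST file, an assumed format) and reports the uncovered regions. The function name uncovered_regions and the min_gap threshold are hypothetical, and the result is only a rough proxy for what you see in ACT, since rearranged but matching regions are not reported as gaps.

```python
from typing import Iterable, List, Tuple


def uncovered_regions(
    intervals: Iterable[Tuple[int, int]],
    genome_length: int,
    min_gap: int = 1,
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive regions of the genome not covered by any match."""
    # Matches on the reverse strand have start > end, so normalise first.
    spans = sorted((min(s, e), max(s, e)) for s, e in intervals)

    gaps = []
    position = 1  # first base not yet known to be covered
    for start, end in spans:
        if start > position and start - position >= min_gap:
            gaps.append((position, start - 1))
        position = max(position, end + 1)

    # Anything left after the last match is also uncovered.
    if position <= genome_length and genome_length - position + 1 >= min_gap:
        gaps.append((position, genome_length))
    return gaps


# Example with made-up coordinates:
# uncovered_regions([(1, 100), (150, 50), (300, 400)], genome_length=500)
```

Setting min_gap to a few thousand bases hides small gaps between adjacent matches and leaves only the larger regions that are present in one genome but absent from the other.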

Discuss how many synteny breaks you can observe.

Is one of the genomes bigger than the other? Discuss the possible reason behind this.
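To check the genome sizes outside ACT, you can count the bases in each FASTA file. This is a minimal sketch using only the Python standard library. It assumes each file holds the chromosome sequence you want to measure, and it sums over all records if a file contains more than one. The function name fasta_length is hypothetical.

```python
from pathlib import Path


def fasta_length(path) -> int:
    """Return the total number of sequence characters in a FASTA file."""
    total = 0
    with Path(path).open() as handle:
        for line in handle:
            if line.startswith(">"):
                continue  # header line, not sequence
            total += len(line.strip())
    return total


# Example:
# difference = fasta_length("TW20.fasta") - fasta_length("MSSA476.fasta")
# print(f"TW20 is {difference} bp longer than MSSA476")
```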

You can read more about the S. aureus TW20 genome in this publication. Can you identify the 127 kb difference mentioned in the publication?

This article is from the free online course Bacterial Genomes III: Comparative Genomics using Artemis Comparison Tool (ACT).

Created by
FutureLearn - Learning For Life
