
Comparing Two FASTA Files

© Wellcome Genome Campus Advanced Courses and Scientific Conferences

In this exercise we will compare two Staphylococcus aureus genomes in ACT and investigate the number of differences, also called synteny breaks.

The first sequence we will download is the genome of S. aureus strain TW20, an antibiotic-resistant strain. Click here to download the GenBank entry.

We will compare this genome to that of an antibiotic-sensitive strain, S. aureus MSSA476. Click here to download the GenBank entry.

To download the FASTA file, go to Send to on the right-hand side of the GenBank record, choose Complete Record, set the destination to File, and select FASTA as the format.

Files downloaded from GenBank in this way are always named sequence.fasta. Rename each file to something more meaningful, for example TW20.fasta and MSSA476.fasta.
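The same download can also be scripted. The sketch below is one possible approach and is not part of the course instructions: it builds a request to the NCBI E-utilities efetch service and saves the result under a meaningful name straight away. The function names are hypothetical, and the accession number is left as an argument because it has to be read from the GenBank record itself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_fasta_url(accession):
    """Return an efetch URL asking for one nucleotide record in FASTA format."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # accession shown at the top of the GenBank record
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def download_fasta(accession, out_path):
    """Fetch the record and write it to out_path (hypothetical helper)."""
    with urlopen(build_fasta_url(accession)) as response, open(out_path, "wb") as handle:
        handle.write(response.read())
    return out_path
```

Calling `download_fasta(accession, "TW20.fasta")` with the accession copied from the TW20 GenBank record would then skip the renaming step.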

If you cannot download the files from a public repository, you can get them from the following FTP site, which also hosts the BlastN comparison file TW20_vs_MSSA476.txt: ftp://ftp.sanger.ac.uk/pub/resources/coursesandconferences/Online_Courses/Course4/Week1/Step_1.15/
You may need to copy and paste the link into your internet browser. We recommend Chrome or Firefox for downloading data files.

Please note: Most browsers no longer support downloading files from FTP sites. To view and download the files you will need an FTP client, such as the free client Cyberduck, which is available for Mac and Windows.
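If you prefer not to install an FTP client, Python's standard library can do the same job. This is a minimal sketch, assuming the server accepts anonymous logins. The function names are hypothetical, and only TW20_vs_MSSA476.txt is named on this page, so check the directory listing for the exact names of any other files.

```python
from ftplib import FTP
from urllib.parse import urlparse

COURSE_URL = (
    "ftp://ftp.sanger.ac.uk/pub/resources/coursesandconferences/"
    "Online_Courses/Course4/Week1/Step_1.15/"
)


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    return parts.hostname, parts.path


def fetch_course_files(url, filenames):
    """Download the named files into the current working directory."""
    host, directory = split_ftp_url(url)
    with FTP(host) as ftp:
        ftp.login()          # anonymous login (assumption)
        ftp.cwd(directory)
        for name in filenames:
            with open(name, "wb") as handle:
                ftp.retrbinary("RETR " + name, handle.write)
```

For example, `fetch_course_files(COURSE_URL, ["TW20_vs_MSSA476.txt"])` would fetch the comparison file.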

We will now open the files in ACT. Double-click on the ACT icon. Once the small ACT window is open, choose the following three files and click Apply.

Sequence file 1: TW20.fasta

Comparison file: TW20_vs_MSSA476.txt

Sequence file 2: MSSA476.fasta
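ACT can also be started from a terminal with the same three files given in the same order. The sketch below assumes the Artemis software is installed and that an `act` launcher is on your PATH, which depends on how it was installed. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_act_command(seq1, comparison, seq2, executable="act"):
    """Argument list in the order ACT expects: sequence 1, comparison, sequence 2."""
    return [executable, str(seq1), str(comparison), str(seq2)]


def launch_act(seq1, comparison, seq2):
    """Check that the input files exist, then start ACT as a separate process."""
    missing = [str(p) for p in (seq1, comparison, seq2) if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError("Missing input file(s): " + ", ".join(missing))
    return subprocess.Popen(build_act_command(seq1, comparison, seq2))
```

With the files from this exercise the call would be `launch_act("TW20.fasta", "TW20_vs_MSSA476.txt", "MSSA476.fasta")`.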

Now follow the steps outlined in the previous section. Zoom out (the control marked with an arrow) to get an overview of the complete genomes. Then drag the slider in the comparison view panel (the middle panel, marked with a circle) all the way down to filter out low-scoring similarities.
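To see what this kind of score filter does to the data, the same idea can be tried directly on the comparison file. This is only an illustration, not a description of how ACT works internally. It assumes the file is in 12-column tab-separated BLAST tabular format (identity in column 3, query and subject coordinates in columns 7 to 10, bit score in column 12); if your file uses a different layout, the column indices need adjusting. All names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    identity: float
    bitscore: float


def parse_hits(lines):
    """Parse BLAST tabular lines, skipping blanks, comments and short rows."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue
        hits.append(Hit(int(fields[6]), int(fields[7]),
                        int(fields[8]), int(fields[9]),
                        float(fields[2]), float(fields[11])))
    return hits


def filter_by_score(hits, min_bitscore):
    """Keep only hits whose bit score reaches the cutoff."""
    return [hit for hit in hits if hit.bitscore >= min_bitscore]
```

Reading the file with `parse_hits(open("TW20_vs_MSSA476.txt"))` and raising `min_bitscore` step by step shows how quickly short, weak matches drop out while the long, high-scoring matches remain.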

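[Screenshot: ACT window, with the zoom control marked by an arrow and the comparison view slider marked by a circle]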

Differences between the genomes are shown as white spaces in the comparison view panel. They are called synteny breaks.

Discuss how many synteny breaks you can observe.
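As a rough cross-check of what you count by eye, you could list the stretches of one genome that no match covers. The sketch below is a simplification under stated assumptions: it takes match coordinates on a single genome as (start, end) pairs, 1-based and inclusive, and reports uncovered stretches of at least `min_gap` bases. It does not detect rearrangements or inversions, so it will not necessarily agree with the picture in ACT. The function name and the `min_gap` parameter are hypothetical.

```python
def uncovered_regions(intervals, genome_length, min_gap=1):
    """Return (start, end) stretches of the genome not covered by any interval."""
    # Reverse-strand matches may be given as (end, start); normalise them first.
    normalised = sorted((min(a, b), max(a, b)) for a, b in intervals)
    gaps = []
    next_uncovered = 1  # first position not yet covered by a match
    for start, end in normalised:
        if start > next_uncovered and start - next_uncovered >= min_gap:
            gaps.append((next_uncovered, start - 1))
        next_uncovered = max(next_uncovered, end + 1)
    # Anything left after the last match up to the end of the genome.
    if genome_length >= next_uncovered and genome_length - next_uncovered + 1 >= min_gap:
        gaps.append((next_uncovered, genome_length))
    return gaps
```

Setting `min_gap` to a few thousand bases ignores small gaps and leaves only the larger unmatched regions, which are the ones most visible as white space at genome scale.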

Is one of the genomes bigger than the other? Discuss the possible reason behind this.
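The genome sizes can also be checked directly from the FASTA files. This is a minimal sketch using only the standard library: it counts every sequence character in a file and ignores header lines. The function name is hypothetical, and the file names are the ones suggested earlier in this exercise.

```python
def fasta_length(path):
    """Total number of sequence characters across all records in a FASTA file."""
    total = 0
    with open(path) as handle:
        for line in handle:
            if not line.startswith(">"):   # skip header lines
                total += len(line.strip())
    return total
```

Comparing `fasta_length("TW20.fasta")` with `fasta_length("MSSA476.fasta")` gives the size difference in bases, which you can set against what you see in ACT.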

You can read more about the S. aureus TW20 genome in this publication. Can you identify the 127 kb difference mentioned in the publication?

This article is from the free online course Bacterial Genomes III: Comparative Genomics using Artemis Comparison Tool (ACT)
