What is Comparative Genomics?
Put simply, comparative genomics is the comparison of two or more genome sequences.
This allows researchers to identify sequences of DNA that are shared, or ‘conserved’, between these genomes. Likewise, sequences that are found in only a single genome, or a subset of genomes, can also be identified. Sequences that resemble one another, but are not totally conserved, can also be detected using comparative genomics.
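To make the idea of "similar but not totally conserved" concrete, the sketch below computes the percentage of matching positions between two sequences. It assumes the sequences are already aligned and of equal length; the function name `percent_identity` and the short example sequences are invented for illustration and are not taken from any real genome or tool.

```python
def percent_identity(seq_a, seq_b):
    """Return the percentage of positions at which two pre-aligned
    sequences carry the same base (illustrative helper, not a real API)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions where the two sequences agree, ignoring case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Made-up 16-base sequences, for illustration only.
reference = "ATGGCGTACGTTAGCA"
identical = "ATGGCGTACGTTAGCA"  # fully conserved copy
similar = "ATGGCGTTCGTTAGCA"    # one substitution relative to the reference

print(percent_identity(reference, identical))
print(percent_identity(reference, similar))
```

Real comparisons work on unaligned genomes and rely on alignment or similarity-search tools first; this only shows how "conserved" and "similar" can be expressed as a number once an alignment exists.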
Comparing sequences to determine how similar or different they are can inform a researcher about how ‘important’ a sequence is to a genome or to an organism’s functioning. For example, if the genomes of several pathogens are compared, there will be genes and sequences common to some, or all, of them. Genes present in multiple genomes are probably required for those pathogens to function, and their ubiquity can indicate that their function is fundamental to the survival of the organism. For instance, the genes encoding proteins that participate in core processes such as energy metabolism, transcription, and translation are well-conserved between bacteria that are otherwise distantly related. Comparative genomics makes this conservation visible.
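As a rough illustration of how genes common to all genomes can be separated from those found in only some, the sketch below treats each genome as a set of gene names and counts how many genomes contain each gene. It assumes the genes have already been identified and named consistently. The genome names, the function `classify_genes`, and the genes `toxX`, `adhY`, and `capZ` are hypothetical; `rpoB`, `atpA`, and `tufA` stand in for conserved transcription, energy metabolism, and translation genes.

```python
from collections import Counter


def classify_genes(genomes):
    """Split genes into those present in every genome, in some genomes,
    and in exactly one genome.

    genomes: dict mapping a genome name to a set of gene names.
    """
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    total = len(genomes)
    in_all = {g for g, c in counts.items() if c == total}
    in_one = {g for g, c in counts.items() if c == 1}
    in_some = set(counts) - in_all - in_one
    return in_all, in_some, in_one


# Toy data: three hypothetical pathogens and their gene content.
genomes = {
    "pathogen_A": {"rpoB", "atpA", "tufA", "toxX"},
    "pathogen_B": {"rpoB", "atpA", "tufA", "adhY"},
    "pathogen_C": {"rpoB", "atpA", "tufA", "adhY", "capZ"},
}

in_all, in_some, in_one = classify_genes(genomes)
print("present in all genomes:", sorted(in_all))
print("present in a subset:", sorted(in_some))
print("present in one genome:", sorted(in_one))
```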
Sequences that are unique to an organism’s genome can be important for its unique characteristics. For instance, imagine that of a collection of 100 E. coli isolates, just one is resistant to a new class of antibiotic. Comparative genomics might identify a sequence in the drug-resistant isolate’s genome that is not found in any of the 99 drug-sensitive isolates. This unique sequence could well be a gene that confers the drug-resistance phenotype on that single strain.
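The 100-isolate scenario can be sketched as a set subtraction: start with the genes of the resistant isolate and remove anything seen in a sensitive isolate. Everything here is invented for illustration, including the function `genes_unique_to`, the placeholder gene names, and the simplifying assumption that all 99 sensitive isolates carry identical gene content.

```python
def genes_unique_to(target, others):
    """Return genes found in `target` but in none of the gene sets in `others`."""
    unique = set(target)
    for genes in others:
        unique -= set(genes)
    return unique


# Hypothetical gene content shared by every isolate in the collection.
shared_genes = {f"gene_{i}" for i in range(5)}

# 99 drug-sensitive isolates (identical here only to keep the example small).
sensitive_isolates = [set(shared_genes) for _ in range(99)]

# One drug-resistant isolate carrying an extra, made-up gene.
resistant_isolate = shared_genes | {"hypothetical_resistance_gene"}

candidates = genes_unique_to(resistant_isolate, sensitive_isolates)
print(candidates)
```

A gene flagged this way is only a candidate: it is associated with the resistant isolate, and further work would be needed to show that it actually causes the resistance.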
In the case of bacterial genomes, it is also often useful to be able to visualise whether genes have been acquired or lost in a sequence of interest, relative to a reference genome. Comparative genomics is convenient for doing these sorts of checks; it allows a researcher to see whether a sequence in a reference is present or absent in a second genome, as well as giving positional information (e.g., whether the sequence on either ‘side’ of the sequence of interest is preserved). For instance, when a bacteriophage or a transposon integrates into a specific site in a bacterial genome, such an event can usually be observed easily in a genome sequence as a large stretch of DNA present between two otherwise-conserved sequences.
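The insertion example can be illustrated with a small string search: find two flanking sequences that are conserved in both genomes and report whatever lies between them. This is a minimal sketch that assumes exact, single-copy flanks on the same strand; the function `find_insertion` and all sequences are made up, and it is not how ACT itself works.

```python
def find_insertion(genome, left_flank, right_flank):
    """Return the sequence lying between two flanking sequences,
    or None if either flank cannot be found in order."""
    left_start = genome.find(left_flank)
    if left_start == -1:
        return None
    insert_start = left_start + len(left_flank)
    # Search for the right flank only downstream of the left flank.
    right_start = genome.find(right_flank, insert_start)
    if right_start == -1:
        return None
    return genome[insert_start:right_start]


# Made-up conserved flanks and a made-up inserted element.
left = "ATGGCGTACG"
right = "TTAGCAGGCT"
inserted_element = "AAATTTCCCGGGAAACCC"

reference = "CC" + left + right + "GG"
query = "CC" + left + inserted_element + right + "GG"

print(repr(find_insertion(reference, left, right)))  # nothing between the flanks
print(repr(find_insertion(query, left, right)))      # the inserted element
```

In the reference the flanks are adjacent, so the function returns an empty string; in the query the extra DNA between the same two flanks is returned, mirroring how an integrated phage or transposon appears between otherwise-conserved sequences.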
These concepts may seem a little abstract, but as we continue with the week and begin to use the Artemis Comparison Tool (ACT), you will have opportunities to see all of these examples using real data.
© Wellcome Genome Campus Advanced Courses and Scientific Conferences