So I’d like to welcome Julian Parkhill. He’s going to talk to us about bacterial genomes. So what do bacterial genomes actually look like? So they’re actually quite dull. They’re fairly nondescript. They look like a kind of mess of spaghetti. They’re just very, very long strings wrapped up into loops. But the important thing about bacterial genomes is not what they look like, not their visual appearance, but the information they contain. The information is stored in the order of the bases. And that storage of information, that digital storage of information, was one of the first big discoveries, I think, of the Molecular Age.
The idea is that the information is not stored in the shape or the structure but in the order of the bases: that digital encoding is the key to how DNA stores such enormous amounts of information. What does that information mean? Within a bacterial genome, you have all of the instructions required to make the organism. So at the basic level, you have information about the proteins: the shape of the proteins, what the proteins do. You have information about how those are regulated, how they’re switched on and off. You have information about how the organism responds to changes in the environment or, in the case of pathogens, the presence of a host.
So all of that information about how to build the organism, and how to control the organism, and how to respond to the changes in the environment is all encoded in the genome. What can that information tell us about how bacteria cause disease? So within the genome is all the information about the biology of the organism. And of course, in a pathogen, that biology includes interacting with a host. So if you look at a genome of a bacterial pathogen for the first time, you will see genes involved in virulence, genes involved in host interaction. So for example, if you look at the genome of diphtheria, you’ll find the diphtheria toxin.
If you look at the genome of pertussis, you’ll find the pertussis toxin. And those are obvious markers of disease. If you look in other pathogens, for example, Salmonella, you can find a whole set of genes that are involved in interacting with the host, with binding to the host, with subverting the host. So all of these genes are encoded in the genome. And if you know what you’re looking for, you can get a good idea of how a pathogen works and what it does from the kind of genes that it encodes. What are the important ways in which genomes vary? The basic level of variation is point mutation, single-nucleotide changes, or SNPs.
On top of that, you have changes in structure, changes in copy number. But most importantly for bacterial genomes, and this is how they differ from eukaryotic pathogens and ourselves, they vary by acquisition and loss of genes, by gene presence and absence. And as you compare two bacterial genomes that are closely related, you’ll find very large numbers of genes that are present in one and absent in the other. And those genes can be single genes. Or they can be blocks of genes. But crucially, if you take two pathogens, or a pathogen and a non-pathogen, you’ll find that those additional genes, which we call accessory genes, are often where the differences in lifestyle are encoded.
So if you have two different pathogens causing slightly different diseases, it will often be the presence and absence of different genes that causes that. How can we use this variation to better understand diseases? One of the best ways of doing that is through comparative genomics. So take two E. coli strains, for example, one that causes hemorrhagic syndrome and one that causes urinary tract infections, and look at how they differ. Look at the genes that are present in one and absent in the other. And you’ll find genes that are involved in those different diseases.
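As an editorial illustration of the comparison described here, the sketch below treats each genome as a set of gene names and reports which genes are shared and which are present in only one strain. It is a minimal sketch under stated assumptions: the function name `compare_gene_content` and every gene name are hypothetical placeholders, not real annotations, and a real analysis would first have to decide which genes in two genomes count as "the same".

```python
def compare_gene_content(genes_a, genes_b):
    """Split two gene lists into shared genes and strain-specific genes."""
    a, b = set(genes_a), set(genes_b)
    return {
        "shared": a & b,        # present in both strains
        "only_in_a": a - b,     # present in A, absent in B
        "only_in_b": b - a,     # present in B, absent in A
    }


# Hypothetical gene names, chosen only to illustrate the idea.
strain_a = ["core_gene_1", "core_gene_2", "toxin_gene_x"]
strain_b = ["core_gene_1", "core_gene_2", "adhesin_gene_y"]

result = compare_gene_content(strain_a, strain_b)
for label in ("shared", "only_in_a", "only_in_b"):
    print(label, sorted(result[label]))
```

The strain-specific sets correspond to the accessory genes mentioned above; the set arithmetic only narrows down candidates, and it does not by itself show that a gene is involved in disease.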
We can compare pathogens to non-pathogens, and identify the genes that are involved in pathogenicity. Or for example, we can look at environmental organisms from which pathogens have recently emerged. And we can look at more fine-scale differences: what subtle changes have there been in proteins, in regulation, in the way that the organism interacts with the host, that have caused it to become pathogenic more recently? And all of that comes from comparing genomes of pathogens and non-pathogens or recently emerged pathogens. How can genomics help us to defeat pathogens? So within the genome are all of the potential drug targets and all of the potential vaccine targets.
And by using, firstly, genomics and then comparative genomics, we can identify what are good targets for drugs, for developing novel drugs, or for re-targeting known drugs, or what are the targets of vaccines. And by looking at comparative genomics, we can understand variation. We can understand how pathogens might avoid vaccines, and how we can tailor vaccines to treat them. That was extremely interesting, Julian. Thank you very much.