This content is taken from the Pompeu Fabra University Barcelona's online course, Why Biology Matters: Basic Concepts. Join the course to learn more.

Skip to 0 minutes and 26 seconds Today we are going to talk about how evolution works and understand the process and the reconstruction of evolution. And we have with us Toni Gabaldón. Toni Gabaldón is an ICREA research professor at the CRG, the Centre for Genomic Regulation, in Barcelona. He’s very interested in reconstructing evolution, and he does a lot of comparative genomics and reconstruction of many processes in evolution through genomic information. He’s interested in the study of the origin, evolution, and function of complex biological systems. He has done a lot of interesting work trying to reconstruct the evolutionary process in several clades of life. Toni, why is comparing genomes key to reconstructing evolutionary processes?

Skip to 1 minute and 25 seconds Okay, you have to think that the genomes are molecules that are passed over generations. So every time a cell divides, this material is copied. But in this copying process, there are sometimes errors that are accumulated. And some of these errors can be selected against, or selected for, but some just stay lying around. So as time passes, the number of these changes increases. So when you compare genomes from different individuals, or from different species, you will see differences that are proportional to the separation of these two organisms. So in fact, you can use this to reverse, to try to reconstruct, what has been the process of creation of these differences.

Skip to 2 minutes and 17 seconds So at the very end, you see the diversity of life looking at the diversity of the genomes. Exactly. All living forms have this genetic material, they have these genomes, and they have accumulated these differences over time. So that’s a way of assessing the diversity of life and reconstructing the relationships between the different species and organisms. You see the diversity of life, but you also see the commonality, the fact that all life shares lots of things. And you can see that in the genomes. Exactly. So the very fact that we can compare genomes from one species and another species is because they have a common origin. So they all use the same material to store their genetic information, the DNA.

Skip to 3 minutes and 8 seconds They have many genes in common, so genes of the same family that you find across the diversity of life… so we have some proteins encoded by genes which are of the same families of proteins that are encoded by bacteria, even if they live in diverse environments. And this is because all life on Earth is related through a common origin. So you could eventually trace all back to a single ancestor that is related to the whole diversity of life. So maybe we should not be surprised if, when people sequence a new genome, they say we share a very high percentage of genes. Exactly. Because we share an ancestry. So therefore, we share some of the genetic information.

Skip to 4 minutes and 1 second So we should not be surprised to find that we share some genes that are similar to those in bacteria or other forms of life. Even if we think we are enormously different. And of course we are, but our basic mechanisms are pretty conserved. When you begin one of your projects, say, the diversity of birds, or fungi, how do you proceed? How do you begin, and what is the process you follow? So the first step is to… you have a group of organisms, like a clade, a group of related organisms whose evolution you want to reconstruct: which species is closest to which one, and so on.

Skip to 4 minutes and 46 seconds So you want, in a way, to reconstruct the tree of life of that clade. And the first step is always to find the genes that are common to all these species. So you have to set which regions of the genome are homologous to which other regions in the other species. So in a way, in simple words, this is to know which gene is what gene in each of the species. And once you have gathered this information, you try to find as many of these genes as possible, because these are genes that are shared by all the species you are studying, and in principle, you could reconstruct from these genes the underlying evolution of a species.

Skip to 5 minutes and 32 seconds The more genes you can gather, the more information you have. Because a single gene is limited in its ability to reconstruct the evolution of the species. But the more you approach the whole genome, the more information you can gather together. So that’s the first step. Then you align these sequences with each other to find these mutations - or changes in the sequence - that should be proportional to the evolutionary relationship. So you count these differences, and you assess where these differences are, and using some mathematical models that implement what we know about how sequences evolve, you can reconstruct back the evolution of a species.

Skip to 6 minutes and 21 seconds So you end up with a tree representing the common ancestry of all these species. This tree is what you call a phylogeny then? Yes, this is what we call a species phylogeny. Because you can also have a tree which represents the evolution of a gene family. This will be a gene phylogeny. But when you combine several of these gene phylogenies into one single phylogeny, this is a close approximation of what has been the evolution of this species, and this is what we call the species phylogeny. And here you reconstruct not only how the evolution of the group went, let’s say the mode, the clades, but you also reconstruct the time. Is it possible to talk about an evolutionary clock?

Skip to 7 minutes and 10 seconds I mean, the molecular clock was a very interesting, inspiring idea that if these mutations, or if these changes accumulate in a way that is proportional to time, then we could time using these trees past events and know when they happened. So the reality is not as simple. There is not a single molecular clock. But there is a relationship between the number of differences and time. This relationship is not direct, so we cannot just go from counting the number of changes to counting the number of years, or millions of years. Because there are different factors, like we have seen now that different parts of the genome evolve at different speeds.

Skip to 7 minutes and 57 seconds So you can have some proteins that evolve much faster than others in the same genome. So they tend to accumulate mutations more rapidly. We have also seen that this clock is not ticking at the same speed in different parts of the tree of life. Because, you can imagine that the generation time, for instance, of the different species is different. So you copy the genome more times in a species that has a short generation time and reproduces every year than in a species that reproduces every 20 years. And also the correction mechanisms and the mutation rates in each species may be different. However, considering all these factors, the number of mutations is telling us, in part, about the time.

Skip to 8 minutes and 47 seconds So there are ways in which we can approximate this time. We can use fossil records that we can calibrate in different groups of species and we can understand these factors. So in a way, we can assess, at least in relative timing, what things happened before or after others. So could you say that we are in the process of the full reconstruction of the tree of life? I mean, to me, I see the full reconstruction of the tree of life as utopia, you know? It is our goal, we have to go there, however it is not as easy. So the first problem I see is that the diversity we know is still a minor fraction of true diversity.

Skip to 9 minutes and 33 seconds And we are now realizing with these projects that just go to the deep sea and get cells and sequence the genomes… We are understanding now that the diversity we know of is just a tiny fraction of true diversity. So if the tree has to be the tree of life, not just the tree of the species we know the best, like vertebrates or plants, I think this will take time just to gather the genetic information we need. Then I also think there is a limitation to how far we can go in the reconstruction of the tree of life, because some of the events are very ancestral.

Skip to 10 minutes and 9 seconds So life originated on Earth 3 billion years ago, and you can accumulate mutations, but then you have the effect that some mutations will accumulate over some preexisting mutations, and there is an effect that is called “saturation of signal”. So for the very ancient events, the saturation of signal can blur this relationship. So I think there’s a limit, and we may reach the limit at some point, and we have to be happy that, okay, maybe we don’t know the exact relationship between these very ancestral groups. Which is a more difficult part? Bacteria, maybe? Or protozoans? I think the basal radiation of some groups is going to be very problematic. Like, I’m thinking now about the origin of eukaryotes.

Skip to 11 minutes and 7 seconds So eukaryotes quickly - from the first eukaryotes we can recognize – apparently they quickly diversified into several distinct groups. So I think clarifying the order in which these groups separated from each other is going to be very challenging.

Skip to 11 minutes and 25 seconds But I also want to say something: sometimes you have these problems in very recent radiations; and you mentioned our work on the birds. So this is a more recent event, much more recent than the origin of eukaryotes. But it was also a very fast radiation. And we participated in this study in which we had the full genomes of 50 different species sampled from the different bird lineages, and indeed there it was not very simple to reconstruct the tree of life. And the problem was that it was a very fast radiation because we think it coincided with the extinction of the dinosaurs. So this group could diversify very fast and occupy different niches.

Skip to 12 minutes and 9 seconds So when you have a lot of speciations in a very short time, you don’t have much time to accumulate differences between one group and the other. So we think that that can be a complication also in some recent radiations. So when things happen very fast, and you don’t have time to accumulate a lot of mutations, even the whole genome may not be enough to solve all the questions. So we have seen how the genome has evolved and how now we use the diversity of the genomes to reconstruct evolution; something extremely interesting and powerful, but, as we have seen, with lots of future challenges.

Conversation with Toni Gabaldón

Toni Gabaldón is an ICREA research professor at the CRG, the Centre for Genomic Regulation, in Barcelona.

We are going to talk about how evolution works and explore the process and the reconstruction of evolution.

Important concepts from the video

1. Clade (4.37 and many other times)

A clade is a group of organisms that consists of a common ancestor and all its lineal descendants, and represents a single “branch” on the tree of life. It may refer to a narrow clade of, say, all Drosophila species or to a wide one, like animals. By definition it is monophyletic, that is, it contains one ancestor (which can be an organism, a population, or a species) and all its descendants. The ancestor can be known or unknown; any and all members of a clade can be extant or extinct.

2. Gene family (3.10)

A gene family is a set of several similar genes, formed by duplication of a single original gene, and generally with similar biochemical functions. One such family comprises the genes for the human hemoglobin (Hb) subunits; the ten genes lie in two clusters on different chromosomes, called the α-globin and β-globin loci. These two gene clusters are thought to have arisen from the duplication of a precursor gene approximately 500 million years ago.

3. Tree of life (4.48)

A metaphor for the evolutionary relationships among all living beings. It can be explored on an interesting website, the Tree of Life Web Project: http://tolweb.org/tree/

4. Homologous genes (5.00)

Genes that have a common ancestor; they share ancestry and usually have a similar function. Two segments of DNA can share ancestry because of either a speciation event (orthologs; in the genomes of different species) or a duplication event (paralogs; in the same genome). Homology between DNA sequences is inferred from their sequence similarity.
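
As a rough illustration of what "sequence similarity" means in practice, the sketch below computes the fraction of identical positions between two sequences that have already been aligned. The function name `fraction_identity` and the toy sequences are hypothetical, and real homology searches rely on dedicated alignment tools and statistical scores rather than a raw identity count.

```python
def fraction_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned, gap-free columns at which two sequences agree.

    Both sequences must already be aligned (same length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return matches / compared


# Toy example (made-up sequences): 7 of 8 positions are identical.
print(fraction_identity("ACGTACGT", "ACGTTCGT"))  # 0.875
```

A high identity is only evidence for shared ancestry, not proof; short or low-complexity sequences can look similar by chance.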

5. Phylogeny (6.24)

A phylogeny, phylogenetic tree or evolutionary tree is a branching diagram or “tree” showing the inferred evolutionary relationships among various biological species based upon similarities and differences in their physical or genetic characteristics. The taxa joined together in the tree are implied to have descended from a common ancestor.
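
To make the idea of a branching diagram built from differences concrete, here is a minimal sketch of UPGMA, one of the simplest distance-based tree-building methods. It is swapped in purely for illustration: the interview refers to mathematical models of sequence evolution, not to UPGMA, and UPGMA assumes that all lineages change at the same rate. The species names and distances are hypothetical values, not data from the course.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree (nested tuples) by repeatedly joining the closest clusters.

    names: list of taxon names.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of leaves in it
    d = dict(dist)                       # working copy of the distances
    while len(sizes) > 1:
        # Find the pair of clusters with the smallest distance.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        merged_size = sizes[a] + sizes[b]
        # Distance from the new cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for c in sizes:
            if c in (a, b):
                continue
            d[frozenset((merged, c))] = (
                sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
            ) / merged_size
        del sizes[a], sizes[b]
        sizes[merged] = merged_size
    return next(iter(sizes))


# Hypothetical pairwise distances (substitutions per site), for illustration only.
pairs = {
    ("human", "chimp"): 0.01,
    ("human", "mouse"): 0.15,
    ("chimp", "mouse"): 0.15,
    ("human", "chicken"): 0.30,
    ("chimp", "chicken"): 0.30,
    ("mouse", "chicken"): 0.30,
}
dist = {frozenset(pair): value for pair, value in pairs.items()}
tree = upgma(["human", "chimp", "mouse", "chicken"], dist)
print(tree)
```

The nested tuples group the closest taxa first, so the innermost pair corresponds to the most recent common ancestor in the tree.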

6. Molecular clock (7.06)

It is a technique that uses the mutation rate of DNA to estimate when past evolutionary events occurred. It may be applied to gene trees or, more interestingly, to species trees. In many cases it is necessary to calibrate the clock with external evidence, such as the fossil record. In that case it may be possible to estimate the rate of the molecular clock as a given number of mutations per unit of time (a million years, for example).
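
The calibration step can be sketched as simple arithmetic, assuming a strict clock (a single constant rate), which the interview stresses is an oversimplification. The function names and all numbers below are hypothetical. Because both lineages accumulate changes after they split, the distance between two species corresponds to twice the time since their common ancestor.

```python
def substitution_rate(distance: float, split_time_myr: float) -> float:
    """Rate per site per million years per lineage, from a calibrated split.

    distance:       substitutions per site between two species.
    split_time_myr: age of their common ancestor (e.g. from fossils), in Myr.
    """
    # Both lineages evolve independently after the split, hence the factor 2.
    return distance / (2 * split_time_myr)


def divergence_time(distance: float, rate: float) -> float:
    """Estimated time since the split (in Myr) under a strict clock."""
    return distance / (2 * rate)


# Hypothetical calibration: a pair separated by 0.20 substitutions per site
# whose split is dated to 100 million years ago by a fossil.
rate = substitution_rate(0.20, 100.0)

# Apply that rate to another pair separated by 0.05 substitutions per site.
print(divergence_time(0.05, rate))  # roughly 25 million years
```

In practice, rates differ between genes and between lineages, so actual dating methods relax the strict-clock assumption and use several calibration points.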

7. Saturation of the signal (10.25)

Saturation of the number of substitutions observed when comparing two DNA sequences occurs when a single site experiences multiple mutations. As a result, over time the observed differences increase more slowly than the number of substitutions that actually occurred. In the next figure, two different types of substitutions are shown: transitions (high mutation rate) and transversions (low mutation rate).
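
One way to see saturation numerically is the Jukes-Cantor (JC69) model, used here only as an illustrative assumption. It treats all nucleotide changes as equally likely, so it does not separate transitions from transversions. Under this model the observed proportion of differing sites levels off at 0.75, no matter how many substitutions have really happened.

```python
import math


def jc69_expected_p(d: float) -> float:
    """Expected proportion of differing sites after d substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_distance(p: float) -> float:
    """Estimated substitutions per site from the observed proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75); beyond that the signal is saturated")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The observed difference grows ever more slowly as true change accumulates.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"true substitutions/site = {d:.1f}, observed difference = {jc69_expected_p(d):.3f}")
```

Close to the plateau, a tiny change in the observed difference corresponds to a very large change in the estimated distance, which is why very ancient events are hard to resolve.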

8. Radiation (10.50)

An evolutionary radiation is an increase in taxonomic diversity in a given clade. Radiations may affect one clade or many, and be rapid or gradual; where they are rapid, and driven by a single lineage’s adaptation to its environment, they are termed adaptive radiations. Perhaps the most familiar example of an evolutionary radiation is that of placental mammals immediately after the extinction of the dinosaurs at the end of the Cretaceous, about 66 million years ago. At that time, the placental mammals were mostly small, insect-eating animals similar in size and shape to modern shrews. By the Eocene (56–34 million years ago), they had evolved into such diverse forms as bats, whales, and horses.

9. Ecological niche (11.02)

In ecology, a niche is the fit of a species to specific environmental conditions. The ecological niche describes how an organism or population responds to the distribution of resources and competitors and how it in turn alters those same factors. It is sometimes viewed as a multidimensional space within an ecological system, whose dimensions are the environmental conditions and resources.

10. Speciation (12.08)

Speciation is the evolutionary process by which biological populations evolve to become distinct species. It thus involves the formation of barriers to reproduction as differentiation among the original groups increases.
