Skip to 0 minutes and 7 seconds Hello everyone! My name is Wen-Chang Wang. The topic of this section is how to identify the disease-causing locus in a DNA sequence. DNA is the carrier of genetic information, and the human DNA sequence is more than 3 billion base pairs long. Many diseases are caused, at least in part, by alterations in the DNA sequence. Identifying the disease-causing locus is important for the prevention and treatment of disease. Linkage analysis and association analysis are the major tools for this task. In this section, we will introduce the fundamental concepts of these two approaches.
Skip to 1 minute and 5 seconds In a human cell, the DNA sequence is located on a set of 23 chromosomes: 22 autosomes and 1 sex chromosome. A regular human cell contains two such sets, one transmitted from the father and one from the mother. The DNA sequence is transmitted from parent to offspring in two stages. First, in meiosis, a regular cell divides into four germ cells called gametes. Each gamete contains a single set of 23 chromosomes. Second, in fertilization, one paternal and one maternal gamete fuse to form a zygote, which again contains two sets of chromosomes.
Skip to 2 minutes and 7 seconds Meiosis is a special form of cell division, and it is the foundation of genetic analysis. In a regular cell, the matching paternal and maternal chromosomes are called homologous chromosomes. In the first step of meiosis, each chromosome is duplicated. Next, homologous chromosomes may exchange material. This random phenomenon is called crossing over. Two rounds of cell division then follow. Finally, four gametes are produced, and one of them is transmitted at random to the offspring.
Skip to 2 minutes and 59 seconds Let ApAm and BpBm denote the genotypes of a subject at two loci, A and B. The subscripts p and m indicate the paternal and maternal alleles, respectively. A gamete generated by meiosis carries one of 4 possible combinations of alleles. When a gamete carries one allele of paternal origin and one of maternal origin, as in case 3 and case 4, we say there is a recombination. The probability of obtaining a recombination is called the recombination fraction, and it depends on the locations of A and B. If A and B are on different chromosomes, then these 4 cases have equal probabilities and the recombination fraction is equal to 0.5.
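The four cases can be written out as a small table of probabilities. The sketch below is illustrative only: the function name `gamete_probabilities` and the gamete labels are assumptions, and cases 1 and 2 are taken to be the non-recombinant gametes, consistent with cases 3 and 4 being the recombinant ones.

```python
def gamete_probabilities(theta):
    """Return the probability of each of the four gamete types at loci A and B.

    theta is the recombination fraction (0 <= theta <= 0.5).
    The function name and labels are hypothetical, chosen for illustration.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return {
        "Ap-Bp": (1 - theta) / 2,  # case 1: both alleles paternal (no recombination)
        "Am-Bm": (1 - theta) / 2,  # case 2: both alleles maternal (no recombination)
        "Ap-Bm": theta / 2,        # case 3: recombinant
        "Am-Bp": theta / 2,        # case 4: recombinant
    }


# Loci on different chromosomes: all four cases are equally likely.
unlinked = gamete_probabilities(0.5)
```

With `theta = 0.5` every entry is 0.25, and for any `theta` the two recombinant entries sum to `theta`.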
Skip to 4 minutes and 8 seconds If A and B are on the same chromosome, recombination can occur through crossing over during meiosis. In this situation, the recombination fraction ranges between 0 and 0.5, and the shorter the distance between the loci, the smaller the recombination fraction. When the recombination fraction is less than 0.5, the two loci are said to be in linkage.
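The lecture does not say how distance translates into a recombination fraction. One standard model, used here purely as an assumption, is Haldane's map function, which supposes that crossovers occur independently along the chromosome. The name `haldane_theta` is hypothetical, and the distance is a genetic distance in Morgans rather than a count of base pairs.

```python
import math


def haldane_theta(distance_morgans):
    """Recombination fraction under Haldane's map function (assumed model).

    distance_morgans is the genetic distance between the two loci in Morgans.
    """
    if distance_morgans < 0:
        raise ValueError("distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


# Closer loci give a smaller recombination fraction; distant loci approach 0.5.
thetas = [haldane_theta(d) for d in (0.0, 0.01, 0.1, 1.0, 5.0)]
```

Under this model the value starts at 0 for loci at the same position, increases with distance, and never exceeds 0.5, which matches the range described above.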
Skip to 4 minutes and 48 seconds The main purpose of linkage analysis is to look for the chromosomal region harboring the disease locus. To carry out linkage analysis, we recruit families with multiple patients and collect the affection status of the family members and their genotypes at many markers across the whole genome. A marker is a locus whose chromosomal location is known and at which different alleles can be detected.
Skip to 5 minutes and 26 seconds Based on the family data, we can derive the transmission pattern of alleles at the markers from parents to offspring and investigate which markers are in linkage with the putative disease locus. The regions spanned by the linked markers could then harbor the disease locus. To pinpoint the location of the disease locus, we need to fine-map the candidate region by applying association analysis.
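As a much simplified illustration of checking a marker for linkage, suppose we could directly count, among the observed meioses, how many gametes were recombinant between the marker and the disease locus. That is an assumption: real pedigree data rarely allow such direct counting, and practical linkage analysis relies on likelihood calculations over whole families. The function names and the counts below are hypothetical.

```python
from math import comb


def estimate_theta(recombinants, meioses):
    """Estimate the recombination fraction as the observed proportion."""
    return recombinants / meioses


def linkage_p_value(recombinants, meioses):
    """One-sided exact binomial p-value against the no-linkage value 0.5.

    Returns P(X <= recombinants) when X ~ Binomial(meioses, 0.5).
    """
    favourable = sum(comb(meioses, k) for k in range(recombinants + 1))
    return favourable / 2 ** meioses


# Made-up example: 2 recombinant gametes observed in 20 meioses.
theta_hat = estimate_theta(2, 20)
p_value = linkage_p_value(2, 20)
```

A small estimate together with a small p-value suggests that the recombination fraction is below 0.5, that is, that the marker and the disease locus are in linkage.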
Skip to 6 minutes and 5 seconds Association analysis is based on the concept of linkage disequilibrium, usually abbreviated as LD, which describes the dependence between alleles at different loci in the population.
Skip to 6 minutes and 22 seconds LD can be caused by mutation. Let’s consider a marker M on a chromosome. Suppose an allele D1 arose by mutation at a locus D near the marker M, and this mutation occurred on a chromosome carrying the M1 allele at marker M. Immediately after the mutation occurs, D1 and M1 are in complete LD because any chromosome carrying D1 must also carry M1. However, recombination during meiosis can generate gametes carrying D1 at locus D and other alleles at marker M, so the extent of LD between D1 and M1 decays over time.
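The lecture does not give a formula for LD. A common way to quantify it, assumed here, is the coefficient D: the frequency of chromosomes carrying both D1 and M1 minus the product of the two allele frequencies. Under the further textbook assumptions of random mating, a large population, and no selection or new mutation, D shrinks by a factor of (1 - theta) each generation, where theta is the recombination fraction. The function names and the frequencies below are hypothetical.

```python
def ld_coefficient(freq_d1_m1, freq_d1, freq_m1):
    """LD coefficient D = P(D1 and M1 together) - P(D1) * P(M1)."""
    return freq_d1_m1 - freq_d1 * freq_m1


def ld_after_generations(d0, theta, generations):
    """Expected D after some generations under the assumptions stated above."""
    return d0 * (1.0 - theta) ** generations


# Made-up example: D1 has frequency 0.01 and sits only on M1 chromosomes,
# while M1 itself has frequency 0.3.
d0 = ld_coefficient(0.01, 0.01, 0.3)

# Tightly linked loci (theta = 0.001) keep their LD far longer than
# loosely linked loci (theta = 0.1).
tight = ld_after_generations(d0, 0.001, 100)
loose = ld_after_generations(d0, 0.1, 100)
```

Comparing `tight` and `loose` shows that after the same number of generations, the closely linked pair retains much more of its initial LD.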
Skip to 7 minutes and 18 seconds As shown in the figure, the speed of decay depends on the recombination fraction between the loci. Therefore, strong observed LD indicates that either the distance between the loci is small or the LD arose recently. Furthermore, if D1 and M1 are in LD and D1 is a disease-causing variant, then affected subjects are more likely to carry the M1 allele than unaffected subjects. Therefore, in association analysis, a marker is said to be associated with the disease if its allelic or genotypic frequencies differ significantly between affected and unaffected subjects. Once an association is found, the disease-causing locus may lie close to the associated marker.
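The lecture does not name a specific test for comparing allele frequencies. One common choice, assumed here, is Pearson's chi-square test on a 2x2 table of allele counts (M1 versus any other allele, affected versus unaffected), with 1 degree of freedom and no continuity correction. The function name and the counts are hypothetical.

```python
import math


def allelic_association_test(case_m1, case_other, control_m1, control_other):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns the chi-square statistic and its p-value.
    """
    observed = [[case_m1, case_other], [control_m1, control_other]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_m1 + control_m1, case_other + control_other]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one allele")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: M1 makes up 60 of 100 alleles in affected subjects
# and 40 of 100 alleles in unaffected subjects.
chi2, p_value = allelic_association_test(60, 40, 40, 60)
```

For these made-up counts the statistic is 8.0 and the p-value is about 0.0047, so the marker would be called associated with the disease at the conventional 0.05 level.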
How to identify the disease-causing locus in DNA sequence
In a human cell, DNA sequences are located on a set of 23 chromosomes including 22 autosomes and 1 sex chromosome. A regular human cell contains two such sets which are transmitted from the father and mother respectively.
In this video, Prof. Wang will explain the steps of DNA sequence transmission during meiosis and fertilization. He will also explain how to identify loci in the DNA sequence, especially disease-causing ones.
Prof. Wen-Chang Wang