
How to identify the disease-causing locus in DNA sequence

Hello everyone! My name is Wen-Chang Wang. The topic of this section is how to identify the disease-causing locus in the DNA sequence. The DNA sequence is the carrier of genetic information. The human DNA sequence is more than 3 billion base pairs long. Many diseases are caused, at least in part, by alterations in the DNA sequence. Identifying the disease-causing locus is important for the prevention and treatment of disease. Linkage analysis and association analysis are the major tools for this task. In this section, we will introduce the fundamental concepts of these two approaches.
In a human cell, the DNA sequence is located on a set of 23 chromosomes: 22 autosomes and 1 sex chromosome. A regular human cell contains 2 such sets, one transmitted from the father and one from the mother. There are two stages in the transmission of the DNA sequence from parent to offspring. First, in meiosis, a regular cell divides into four germ cells, which are called gametes. Each gamete contains a single set of 23 chromosomes. Second, in fertilization, one paternal and one maternal gamete fuse to form a zygote, which again contains 2 sets of chromosomes.
Meiosis is a special form of cell division and it is the foundation of genetic analysis. In a regular cell, the matching paternal and maternal chromosomes are called homologous chromosomes. In the first step of meiosis, each chromosome is duplicated. Next, homologous chromosomes may exchange material. This random phenomenon is called crossing over. Two rounds of cell division then follow. Finally, four gametes are produced and one of them is transmitted randomly to the offspring.
Let ApAm and BpBm denote the genotypes of a subject at two loci A and B on the chromosome. The subscripts p and m indicate the paternal and maternal alleles, respectively. On a gamete generated by meiosis, there are 4 possible combinations of alleles: ApBp, AmBm, ApBm and AmBp. When a gamete carries alleles transmitted from both parents, as in cases 3 and 4 (ApBm and AmBp), we say there is a recombination. The probability of obtaining a recombination is called the recombination fraction and it depends on the locations of A and B. If A and B are on different chromosomes, then these 4 cases have equal probabilities and the recombination fraction is equal to 0.5.
If A and B are on the same chromosome, recombination can occur through crossing over during meiosis. In this situation, the recombination fraction ranges between 0 and 0.5, and the shorter the distance between the loci, the smaller the recombination fraction. When the recombination fraction is less than 0.5, the two loci are said to be in linkage.
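To make the recombination fraction concrete, here is a minimal Python sketch that simulates gametes for two loci A and B and counts how often the two alleles come from different parents. The function names and the values of theta used below are hypothetical choices for illustration; the sketch treats recombination as a single random event with probability theta and does not model individual crossovers.

```python
import random


def simulate_gamete(theta, rng):
    """Return the parental origin ('p' or 'm') of the alleles at loci A and B.

    theta is the assumed recombination fraction between A and B (0 to 0.5).
    """
    # The allele at locus A comes from either parent with equal probability.
    origin_a = rng.choice("pm")
    # With probability theta, the allele at locus B comes from the other parent.
    if rng.random() < theta:
        origin_b = "m" if origin_a == "p" else "p"
    else:
        origin_b = origin_a
    return origin_a, origin_b


def estimate_recombination_fraction(theta, n_gametes, seed=0):
    """Simulate n_gametes gametes and return the observed share of recombinants."""
    rng = random.Random(seed)
    recombinants = 0
    for _ in range(n_gametes):
        origin_a, origin_b = simulate_gamete(theta, rng)
        if origin_a != origin_b:  # alleles from both parents: a recombination
            recombinants += 1
    return recombinants / n_gametes


if __name__ == "__main__":
    # 0.5 mimics loci on different chromosomes; smaller values mimic closer loci.
    for theta in (0.5, 0.1, 0.01):
        print(theta, estimate_recombination_fraction(theta, 100_000))
```

With theta set to 0.5 the four allele combinations appear about equally often, while smaller values of theta make the two non-recombinant combinations dominate.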
The main purpose of linkage analysis is to look for the chromosomal region harboring the disease locus. To carry out a linkage analysis, we recruit families with multiple patients and collect the affection status of the family members and their genotypes at many markers across the whole genome. A marker is a locus whose location on the chromosome is known and at which different alleles can be detected.
Based on the family data, we can derive the transmission pattern of alleles at the markers from parents to offspring and investigate which markers are in linkage with the putative disease locus. Then, the regions spanned by the linked markers could harbor the disease locus. To identify the location of the disease locus, we need further fine mapping of the candidate region by applying association analysis.
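One common way to quantify the evidence that a marker is in linkage with a disease locus is the LOD score, which the lecture does not name explicitly. The sketch below assumes the simplest textbook setting, in which each meiosis can be classified directly as recombinant or non-recombinant; real family data usually require more elaborate likelihood calculations. The function names and counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """log10 likelihood ratio of linkage at theta versus no linkage (theta = 0.5).

    Assumes every meiosis can be scored as recombinant or non-recombinant.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def best_theta(recombinants, meioses, grid_size=50):
    """Return the grid value of theta in (0, 0.5] with the highest LOD score."""
    grid = [0.5 * k / grid_size for k in range(1, grid_size + 1)]
    return max(grid, key=lambda theta: lod_score(recombinants, meioses, theta))


if __name__ == "__main__":
    # Hypothetical data: 2 recombinants observed among 10 informative meioses.
    theta_hat = best_theta(2, 10)
    print(theta_hat, lod_score(2, 10, theta_hat))
```

A positive LOD score at some theta below 0.5 favors linkage between the marker and the putative disease locus; a score near zero means the data do not distinguish linkage from independent transmission.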
Association analysis is based on the concept of linkage disequilibrium, usually abbreviated as LD, which describes the dependence between alleles at different loci in the population.
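A standard way to put a number on this dependence is the coefficient D, the difference between the observed frequency of a two-locus haplotype and the frequency expected if the alleles were independent. The following Python sketch computes D from haplotype counts; the allele labels A1, A2, B1, B2 and the counts are hypothetical values chosen for illustration.

```python
def ld_coefficient(haplotype_counts, allele_a, allele_b):
    """Return D = P(allele_a and allele_b together) - P(allele_a) * P(allele_b).

    haplotype_counts maps (allele at locus A, allele at locus B) to a count.
    """
    total = sum(haplotype_counts.values())
    p_ab = haplotype_counts.get((allele_a, allele_b), 0) / total
    p_a = sum(n for (a, _), n in haplotype_counts.items() if a == allele_a) / total
    p_b = sum(n for (_, b), n in haplotype_counts.items() if b == allele_b) / total
    return p_ab - p_a * p_b


if __name__ == "__main__":
    # Hypothetical sample of 100 chromosomes in which A1 is always found with B1.
    counts = {("A1", "B1"): 30, ("A1", "B2"): 0,
              ("A2", "B1"): 20, ("A2", "B2"): 50}
    print(ld_coefficient(counts, "A1", "B1"))
```

A value of D equal to zero means the alleles at the two loci occur independently; the further D is from zero, the stronger the LD.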
LD can be caused by mutation. Let’s consider a marker M on the chromosome. Suppose an allele D1 arose by mutation at a locus D near the marker M and this mutation occurred on a chromosome carrying the M1 allele at marker M. Immediately after the mutation occurs, D1 and M1 are in complete LD because every chromosome carrying D1 must also carry M1. However, recombination during meiosis can generate gametes carrying D1 at locus D and other alleles at marker M, so the extent of LD between D1 and M1 decays over time.
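The decay can be sketched with a standard population-genetics result: if D_0 is the initial excess frequency of the D1-M1 haplotype over what independence would predict, then after t generations of random mating the excess is D_t = (1 - theta)^t * D_0, where theta is the recombination fraction between the loci. The Python sketch below assumes random mating in a large population with no selection or further mutation; the function names and input values are hypothetical.

```python
import math


def ld_after_generations(d0, theta, generations):
    """Return the expected LD after the given number of generations.

    Uses D_t = d0 * (1 - theta) ** t, assuming random mating in a large
    population with no selection or new mutation.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    return d0 * (1.0 - theta) ** generations


def generations_to_halve(theta):
    """Return the number of generations needed for LD to fall to half its value."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    return math.log(0.5) / math.log(1.0 - theta)


if __name__ == "__main__":
    # Hypothetical starting LD of 0.2 followed over 100 generations.
    for theta in (0.5, 0.01, 0.001):
        print(theta, ld_after_generations(0.2, theta, 100), generations_to_halve(theta))
```

Loci that are far apart lose their LD within a few generations, while tightly linked loci keep it for hundreds of generations.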
As shown in the figure, the speed of decay depends on the recombination fraction between the loci. Therefore, strong observed LD indicates either that the distance between the loci is small or that the LD arose recently. Furthermore, if D1 and M1 are in LD and D1 is a disease-causing variant, then affected subjects are more likely to carry the M1 allele than unaffected subjects. In association analysis, a marker is therefore said to be associated with the disease if its allelic or genotypic frequencies differ significantly between affected and unaffected subjects. Once an association is found, the disease-causing locus is likely to be close to the associated marker.
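As an illustration of comparing allelic frequencies between affected and unaffected subjects, the sketch below runs a Pearson chi-square test on a 2 x 2 table of allele counts. This is one common choice of test and is not necessarily the one used in the course; the function name and the counts are hypothetical, and the sketch assumes that every row and column total is positive.

```python
import math


def allelic_association_test(case_m1, case_other, control_m1, control_other):
    """Pearson chi-square test (1 degree of freedom) on a 2 x 2 allele-count table.

    Returns (chi-square statistic, p-value). Assumes all marginal totals are > 0.
    """
    observed = [[case_m1, case_other], [control_m1, control_other]]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts: M1 is more frequent among affected subjects.
    chi2, p_value = allelic_association_test(60, 40, 40, 60)
    print(chi2, p_value)
```

A small p-value suggests that the frequency of M1 differs between the two groups. In a genome-wide scan the significance threshold would also need to account for the large number of markers tested.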

In a human cell, the DNA sequence is located on a set of 23 chromosomes: 22 autosomes and 1 sex chromosome. A regular human cell contains two such sets, one transmitted from the father and one from the mother.

In this video, Prof. Wang explains the steps of DNA sequence transmission during meiosis and fertilization, and how to identify loci in the sequence, especially the disease-causing ones.


Prof. Wen-Chang Wang

This article is from the free online

Introduction to Translational Research: Connecting Scientists and Medical Doctors

