2019-02-23 QTL (transferred from Zhihu)

Link: /question/27695566/answer/40741777

I am writing this partly to organize my own thoughts. Comments and corrections are welcome.

First, the basic idea of QTL mapping

QTL stands for quantitative trait locus. As the name implies, these are loci in the genome that have a measurable, quantitative influence on a specific trait. The key point is that the trait we care about is a quantitative trait, one that takes many different levels, rather than a qualitative trait. The most typical example of a qualitative trait is disease status (sick/not sick). In general, any complex trait (one determined by multiple genes) can be treated as a quantitative trait, for example human height, weight, or IQ, or plant height and yield in crops. (I actually find the conventional translation of "quantitative" as "quantity" a little odd here, but it is the established term and everyone understands it.)

The basic principle of QTL mapping is to measure a quantitative trait (the phenotype) in a group of individuals together with their genotypes (that is, a set of genetic markers in the genome, such as SNPs or RFLPs, not necessarily the whole genome), and then to look for a relationship between genotype and phenotype. For example:

[Figure: distribution of the quantitative trait value in the population, split by genotype, for three loci A, B, and C]

In the figure, the x axis is the value of the quantitative trait, and the y axis is the number of individuals in the population with that trait value. The bottom panel shows locus C: whichever genotype an individual carries (cc or cc'), the trait follows the same normal distribution, so C is unrelated to the trait. The genotype at locus B is related to the trait: individuals with genotype bb tend to have smaller trait values, while those with bb' tend to have larger ones. At locus A the correlation with the trait value is even more pronounced. The essence of QTL mapping is to find which genetic markers in the genome correlate most strongly with the quantitative trait. Note that the word here is "correlation", which does not imply a direct or causal effect, because locus A may merely be linked to the gene that actually matters. After QTL mapping, further experiments are therefore needed to verify or find the genes/loci that have a real genetic effect.
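The idea in the figure can be sketched as a single-marker scan on simulated data. This is only an illustration under assumed numbers: the effect sizes (2.0 for A, 0.7 for B, none for C), the sample size, and the function names `simulate` and `scan` are all made up for this example, and a plain Pearson correlation stands in for whatever test a real analysis would use.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=2000, seed=1):
    """Simulate three unlinked biallelic markers and one quantitative trait.

    Assumed effects (hypothetical): A is strong, B is weak, C has none.
    """
    rng = random.Random(seed)
    genotypes = {"A": [], "B": [], "C": []}
    trait = []
    for _ in range(n):
        a, b, c = (rng.randint(0, 1) for _ in range(3))
        genotypes["A"].append(a)
        genotypes["B"].append(b)
        genotypes["C"].append(c)
        trait.append(2.0 * a + 0.7 * b + rng.gauss(0, 1))
    return genotypes, trait


def scan(genotypes, trait):
    """Correlate every marker with the trait, one marker at a time."""
    return {name: pearson(g, trait) for name, g in genotypes.items()}


genotypes, trait = simulate()
for name, r in scan(genotypes, trait).items():
    print(name, round(r, 3))
```

The marker with the largest absolute correlation is the best QTL candidate, but as noted above it may only be linked to the causal gene.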

Second, the main factors affecting the results of QTL analysis

1. Population

Generally speaking, QTL mapping starts from a few founder parents (with clear differences in the quantitative trait) that are crossed in various ways. Phenotype and genotype are then measured in their offspring, and the QTL is finally located through the correlation between the two.

An association study, by contrast, generally takes a group of individuals as they come (wild, with unknown genetic background), measures phenotype and genotype directly, and then looks at the correlation between the two. QTL mapping prefers populations with a clear genetic background (that is, offspring of crosses between a few founders), because the results are then not confounded by population structure. For example, if you used the association approach on a mixed sample of people of European and East Asian ancestry to look for height QTLs, most of what you found would probably be ancestry differences that have nothing to do with height. (Association analysis has its own advantages, though. For example, it can cover more genetic diversity than the founders alone carry. Also, this does not mean that association data cannot be used for QTL mapping, only that you have to be very, very careful.)
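The population-structure problem can be illustrated with a small simulation. The sketch below assumes two hypothetical subpopulations that differ both in the allele frequency of a marker (0.1 vs 0.9) and in mean trait value (a shift of 2.0), while the marker itself has no effect on the trait at all. All numbers and names are invented for the example.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_structured(n_per_pop=1000, seed=5):
    """Return (genotypes, traits) per subpopulation for a marker with NO effect.

    Subpopulation 0: allele frequency 0.1, trait mean 0.
    Subpopulation 1: allele frequency 0.9, trait mean 2 (for unrelated reasons).
    """
    rng = random.Random(seed)
    pops = []
    for freq, mean in ((0.1, 0.0), (0.9, 2.0)):
        geno = [1 if rng.random() < freq else 0 for _ in range(n_per_pop)]
        trait = [rng.gauss(mean, 1) for _ in range(n_per_pop)]
        pops.append((geno, trait))
    return pops


pops = simulate_structured()
pooled_geno = pops[0][0] + pops[1][0]
pooled_trait = pops[0][1] + pops[1][1]

pooled_r = pearson(pooled_geno, pooled_trait)       # spurious association
within_r = [pearson(g, t) for g, t in pops]         # close to zero in each group

print("pooled:", round(pooled_r, 3))
print("within each subpopulation:", [round(r, 3) for r in within_r])
```

Pooling the two groups produces a strong marker-trait correlation even though the marker does nothing, which is the kind of false signal a cross with known founders avoids.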

2. Statistical power

Analysis of variance (ANOVA) and generalized linear models (GLM) are the usual statistical methods for QTL mapping. All such statistical methods share one bottleneck: statistical power.

This bottleneck is especially visible now that the cost of sequencing has fallen so much. To locate QTLs precisely, people use more and more genetic markers. The result is a very large number of hypothesis tests (one per marker), and once a multiple-testing correction is applied, few loci remain significant.
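To make the cost of the correction concrete, here is a minimal sketch using the Bonferroni correction, which is only one of several possible corrections and is assumed here for simplicity. The function names and the significance level of 0.05 are illustrative choices.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def required_z(alpha, n_tests):
    """Two-sided z-score a single marker must exceed after Bonferroni correction."""
    per_test = bonferroni_threshold(alpha, n_tests)
    return NormalDist().inv_cdf(1 - per_test / 2)


for n_markers in (1, 100, 10_000, 1_000_000):
    print(
        n_markers,
        bonferroni_threshold(0.05, n_markers),
        round(required_z(0.05, n_markers), 2),
    )
```

The more markers are tested, the larger the effect (or the sample) has to be before any single marker clears the threshold.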

Interactions between loci also matter a great deal. For example:

[Figure: individuals with high (stars) and low (circles) trait values, plotted by their genotypes at two genetic markers, x1 and x2]

Suppose the stars in the picture are individuals with high values of the quantitative trait and the circles are individuals with low values. The x1 and x2 axes represent the genotypes at two genetic markers. Viewed together, the two loci are clearly strongly related to the trait, but viewed alone, neither x1 nor x2 shows any relationship at all. x1 and x2 are two interacting QTLs. If there are 10,000 loci in the genome, you need on the order of 100 million hypothesis tests (10,000 × 10,000 pairs, or about 50 million if each unordered pair is counted once) to find this interaction. And that is only the interaction between two loci. What about three?
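A "clean" interaction of this kind can be mimicked with an XOR-style model. The sketch below assumes, purely for illustration, that the trait is high when exactly one of the two markers carries allele 1; the effect size, sample size, and function names are hypothetical.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_xor(n=2000, seed=2):
    """Two unlinked markers whose effect on the trait exists only jointly."""
    rng = random.Random(seed)
    x1 = [rng.randint(0, 1) for _ in range(n)]
    x2 = [rng.randint(0, 1) for _ in range(n)]
    # Assumed model: trait is raised by 2.0 when x1 and x2 differ.
    trait = [2.0 * (a ^ b) + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return x1, x2, trait


x1, x2, trait = simulate_xor()
joint = [a ^ b for a, b in zip(x1, x2)]

print("x1 alone:", round(pearson(x1, trait), 3))
print("x2 alone:", round(pearson(x2, trait), 3))
print("x1 and x2 together:", round(pearson(joint, trait), 3))

# Number of unordered marker pairs and triples among 10,000 loci.
print("pairs:", math.comb(10_000, 2))
print("triples:", math.comb(10_000, 3))
```

Neither marker shows a signal in a single-marker scan, so the interaction is only found by testing pairs, and the number of pairs (and especially triples) grows very quickly with the number of markers.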

Many people are now working on these problems. Increasing the number of samples in the population is the obvious direction, but the gain is limited. Some try to test interactions at the level of genes rather than individual markers [PLoS Genetics: Gene-Based Testing of Interactions in Association Studies of Quantitative Traits]. Others argue that real interactions will not be as "clean" as in the figure above (where x1 and x2 have no effect alone but interact strongly). Their idea is to first find a few loci with an effect of their own, and then test the interaction between those loci and all other loci [PLoS Biology: Multiple Locus Linkage Analysis of Genomewide Expression in Yeast].

3. Recombination/linkage

Ideally, if none of the genetic markers were linked, you could map a QTL precisely to one specific marker, because the markers next to it would show no correlation with the quantitative trait. In practice, adjacent markers are very likely to be linked, so a whole group of adjacent markers appears to be strongly correlated with the trait, unless the linkage between them has been broken by recombination. The main way to increase recombination is to increase the number of generations of crossing (too many generations is not good either: the genome may become so shuffled that the markers can no longer be traced back to the founder genomes).
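The effect of linkage on mapping resolution can be sketched with a toy chromosome. This example makes several simplifying assumptions that are not from the original: offspring are haploid mosaics of two founders, recombination between adjacent markers happens independently with probability `r`, and a larger `r` is used as a crude stand-in for more generations of crossing. Marker 10 is the (hypothetical) causal locus.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_offspring(n_markers, r, rng):
    """One chromosome: founder origin (0 or 1) at each marker along its length."""
    allele = rng.randint(0, 1)
    chromosome = [allele]
    for _ in range(n_markers - 1):
        if rng.random() < r:      # recombination between adjacent markers
            allele = 1 - allele
        chromosome.append(allele)
    return chromosome


def marker_trait_correlations(r, n=2000, n_markers=21, causal=10, seed=3):
    """Correlation of every marker with a trait driven by the causal marker only."""
    rng = random.Random(seed)
    offspring = [simulate_offspring(n_markers, r, rng) for _ in range(n)]
    trait = [2.0 * c[causal] + rng.gauss(0, 1) for c in offspring]
    return [pearson([c[m] for c in offspring], trait) for m in range(n_markers)]


tight = marker_trait_correlations(r=0.01)   # little recombination
loose = marker_trait_correlations(r=0.20)   # much more recombination

for name, corr in (("r=0.01", tight), ("r=0.20", loose)):
    print(name, [round(corr[m], 2) for m in (5, 8, 10, 12, 15)])
```

With little recombination, markers several positions away from the causal one look almost as good as the causal marker itself. With more recombination, the signal narrows around the causal marker.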

Third, analysis techniques derived from QTL mapping

1. From macro to micro

QTL mapping was originally used to analyze macroscopic quantitative traits, such as the human height and crop yield mentioned above. Nowadays more and more people use it to analyze molecular-level traits, for example eQTL (expression QTL), pQTL (protein QTL), sQTL (splicing QTL), and so on.

2. Extreme QTL (xQTL)

[ http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2862354/ ]

It starts with a crossing scheme similar to ordinary QTL mapping: two (or more) founders are crossed. After several generations, the individuals with extremely high values of the quantitative trait are selected from a very large pool of offspring, and the genotypes of these individuals are determined. After selection, the alleles that increase the quantitative trait appear at high frequency among the selected individuals, which is not the case in the unselected group. By comparing allele frequencies between the two groups, the QTL can be located.
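The selection step can be sketched as follows. This is a toy simulation under assumptions that are not in the original: five unlinked loci, only locus 0 affects the trait (effect 1.5), the top 5% of offspring are selected, and the remaining individuals serve as the unselected comparison group. Function names are illustrative.

```python
import random


def simulate_pool(n=10_000, n_loci=5, effect=1.5, seed=4):
    """Offspring pool: list of (trait, alleles). Only locus 0 affects the trait."""
    rng = random.Random(seed)
    individuals = []
    for _ in range(n):
        alleles = [rng.randint(0, 1) for _ in range(n_loci)]
        trait = effect * alleles[0] + rng.gauss(0, 1)
        individuals.append((trait, alleles))
    return individuals


def allele_frequencies(group, n_loci):
    """Frequency of allele 1 at each locus within a group of individuals."""
    return [sum(alleles[i] for _, alleles in group) / len(group) for i in range(n_loci)]


def xqtl_scan(individuals, top_fraction=0.05):
    """Allele-frequency difference (selected minus unselected) at each locus."""
    ranked = sorted(individuals, key=lambda ind: ind[0], reverse=True)
    cut = int(len(ranked) * top_fraction)
    selected, rest = ranked[:cut], ranked[cut:]
    n_loci = len(ranked[0][1])
    f_selected = allele_frequencies(selected, n_loci)
    f_rest = allele_frequencies(rest, n_loci)
    return [s - u for s, u in zip(f_selected, f_rest)]


differences = xqtl_scan(simulate_pool())
for locus, d in enumerate(differences):
    print("locus", locus, "frequency difference:", round(d, 3))
```

Only the locus that affects the trait shows a large frequency shift in the selected group. The other loci stay near zero, up to sampling noise.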