Imperfect Linkage Disequilibrium Generates Phantom Epistasis (& Perils of Big Data)

Cited by: 32
Authors
de los Campos, Gustavo [1, 2]
Sorensen, Daniel Alberto [3]
Angel Toro, Miguel [4]
Affiliations
[1] Michigan State Univ, IQ Inst Quantitat Hlth Sci & Engn, Epidemiol & Biostat Dept, E Lansing, MI 48824 USA
[2] Michigan State Univ, IQ Inst Quantitat Hlth Sci & Engn, Stat & Probabil Dept, E Lansing, MI 48824 USA
[3] Aarhus Univ, Fac Sci & Technol, Dept Mol Biol & Genet, Aarhus, Denmark
[4] Univ Politecn Madrid, Prod Anim, Madrid, Spain
Keywords
epistasis; apparent epistasis; phantom epistasis; GWAS; linkage disequilibrium; imperfect LD; missing heritability; Big Data; HERITABILITY;
DOI
10.1534/g3.119.400101
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
The genetic architecture of complex human traits and diseases is affected by a large number of possibly interacting genes, but detecting epistatic interactions can be challenging. In the last decade, several studies have alluded to problems that linkage disequilibrium (LD) can create when testing for epistatic interactions between DNA markers. However, these problems have not been formalized, nor have their consequences been quantified in a precise manner. Here we use a conceptually simple three-locus model involving a causal locus and two markers to show that imperfect LD can generate the illusion of epistasis, even when the underlying genetic architecture is purely additive. We describe necessary conditions for such "phantom epistasis" to emerge and quantify its relevance using simulations. Our empirical results demonstrate that phantom epistasis can be a very serious problem in GWAS (with rejection rates against the additive model greater than 0.28 for nominal p-values of 0.05, even when the model is purely additive). Some studies have sought to avoid this problem by only testing interactions between SNPs with R² < 0.1. We show that this threshold is not appropriate and demonstrate that the magnitude of the problem is even greater with large sample size, intermediate allele frequencies, and when the causal locus explains a large amount of phenotypic variance. We conclude that caution must be exercised when interpreting GWAS results derived from very large data sets showing strong evidence in support of epistatic interactions between markers.
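The three-locus setting described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the allele frequency (0.3), the marker "flip" probability (0.2), the variance share `h2`, the function names, and the normal approximation used for the p-value are all assumptions chosen for illustration. The phenotype is purely additive in an unobserved causal locus, and the regression tests the interaction between two markers that are each in imperfect LD with it.

```python
import math
import numpy as np


def simulate_genotypes(rng, n, p_causal, flip):
    """Draw diploid genotypes (0/1/2) for one causal locus and two markers.

    Each marker allele copies the causal allele on the same haplotype and is
    flipped with probability `flip`, so both markers are in imperfect LD with
    the causal locus (assumed LD mechanism, for illustration only).
    """
    z_hap = rng.random((n, 2)) < p_causal            # causal alleles on 2 haplotypes
    x1_hap = z_hap ^ (rng.random((n, 2)) < flip)     # marker 1 alleles
    x2_hap = z_hap ^ (rng.random((n, 2)) < flip)     # marker 2 alleles
    return z_hap.sum(axis=1), x1_hap.sum(axis=1), x2_hap.sum(axis=1)


def interaction_p_value(y, x1, x2):
    """Two-sided p-value for the x1*x2 coefficient in an OLS fit.

    Uses a normal approximation to the t distribution, which is reasonable
    only for large samples.
    """
    X = np.column_stack([np.ones_like(y), x1, x2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stat = beta[3] / math.sqrt(cov[3, 3])
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


def rejection_rate(n=5000, reps=200, h2=0.2, p_causal=0.3, flip=0.2,
                   alpha=0.05, seed=1):
    """Share of replicates in which the marker-by-marker interaction is
    declared significant although the phenotype is additive in the causal
    locus. `h2` is the variance share explained by the causal locus."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        z, x1, x2 = simulate_genotypes(rng, n, p_causal, flip)
        g = (z - z.mean()) / z.std()                 # standardized additive signal
        y = math.sqrt(h2) * g + math.sqrt(1.0 - h2) * rng.normal(size=n)
        if interaction_p_value(y, x1, x2) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Additive causal effect tagged imperfectly by both markers.
    print("additive signal:", rejection_rate(h2=0.2))
    # No genetic signal at all: the rate should sit near the nominal level.
    print("no signal:      ", rejection_rate(h2=0.0))
```

Comparing the two calls separates inflation caused by imperfect LD from the test's baseline behavior; varying `n`, `p_causal`, and `h2` is one way to explore the dependence on sample size, allele frequency, and variance explained that the abstract reports.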
Pages: 1429-1436
Page count: 8