Measuring missing heritability: Inferring the contribution of common variants

Cited by: 197
Authors
Golan, David [1]
Lander, Eric S. [2, 3, 4]
Rosset, Saharon [1]
Affiliations
[1] Tel Aviv Univ, Sch Math Sci, Dept Stat & Operat Res, IL-69978 Tel Aviv, Israel
[2] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[3] MIT, Dept Biol, Cambridge, MA 02139 USA
[4] Harvard Univ, Sch Med, Dept Syst Biol, Boston, MA 02155 USA
Keywords
genome-wide association studies; statistical genetics; heritability estimation; GENOME-WIDE ASSOCIATION; HUMAN HEIGHT; SUSCEPTIBILITY LOCI; MULTIPLE-SCLEROSIS; SNPS; SCHIZOPHRENIA; PROPORTION; DISEASE; LINKAGE; MODELS;
DOI
10.1073/pnas.1419064111
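Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract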
Genome-wide association studies (GWASs), also called common variant association studies (CVASs), have uncovered thousands of genetic variants associated with hundreds of diseases. However, the variants that reach statistical significance typically explain only a small fraction of the heritability. One explanation for the "missing heritability" is that there are many additional disease-associated common variants whose effects are too small to detect with current sample sizes. It therefore is useful to have methods to quantify the heritability due to common variation, without having to identify all causal variants. Recent studies applied restricted maximum likelihood (REML) estimation to case-control studies for diseases. Here, we show that REML considerably underestimates the fraction of heritability due to common variation in this setting. The degree of underestimation increases with the rarity of disease, the heritability of the disease, and the size of the sample. Instead, we develop a general framework for heritability estimation, called phenotype correlation-genotype correlation (PCGC) regression, which generalizes the well-known Haseman-Elston regression method. We show that PCGC regression yields unbiased estimates. Applying PCGC regression to six diseases, we estimate the proportion of the phenotypic variance due to common variants to range from 25% to 56% and the proportion of heritability due to common variants from 41% to 68% (mean 60%). These results suggest that common variants may explain at least half the heritability for many diseases. PCGC regression also is readily applicable to other settings, including analyzing extreme-phenotype studies and adjusting for covariates such as sex, age, and population structure.
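The abstract describes PCGC regression as a generalization of Haseman-Elston regression. As a rough illustration of that underlying idea only, the sketch below regresses products of standardized phenotypes on pairwise genetic relatedness for a simulated quantitative trait and reads the slope as a heritability estimate. The function names, simulation settings, and the no-intercept form are assumptions made for this example. The case-control ascertainment correction that PCGC regression adds is not reproduced here.

```python
import numpy as np


def haseman_elston(y_std, grm):
    """Slope of the no-intercept regression of y_i * y_j on G_ij over pairs i < j.

    y_std: phenotypes already standardized to mean 0 and variance 1.
    grm:   n x n genetic relatedness matrix (hypothetical input for this sketch).
    """
    upper = np.triu_indices(len(y_std), k=1)      # each unordered pair once
    products = np.outer(y_std, y_std)[upper]      # phenotype products y_i * y_j
    relatedness = grm[upper]                      # matching off-diagonal G_ij
    return float(relatedness @ products / (relatedness @ relatedness))


def simulate(n=1000, m=200, h2=0.5, seed=0):
    """Simulate a quantitative trait where all m SNPs are causal (an assumption)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=m)                   # allele frequencies
    geno = rng.binomial(2, freqs, size=(n, m)).astype(float)
    z = (geno - geno.mean(axis=0)) / geno.std(axis=0)       # standardize each SNP
    grm = z @ z.T / m                                       # relatedness matrix
    g = z @ rng.normal(size=m)                              # genetic component
    e = rng.normal(size=n)                                  # environmental noise
    # Rescale so the genetic share of the variance is close to h2.
    y = np.sqrt(h2) * g / g.std() + np.sqrt(1 - h2) * e / e.std()
    y_std = (y - y.mean()) / y.std()
    return y_std, grm


y_std, grm = simulate()
h2_hat = haseman_elston(y_std, grm)
print(f"Haseman-Elston estimate of h2 (simulated at 0.5): {h2_hat:.2f}")
```

With unrelated simulated individuals the off-diagonal relatedness values are small, so the estimate is noisy. The point of the sketch is only the structure of the regression, not its accuracy.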
Pages: E5272-E5281
Page count: 10