Resetting the bar: Statistical significance in whole-genome sequencing-based association studies of global populations

Cited by: 49
Authors
Pulit, Sara L. [1]
de With, Sera A. J. [2]
de Bakker, Paul I. W. [3, 4]
Affiliations
[1] Univ Med Ctr Utrecht, Dept Neurol, Brain Ctr Rudolf Magnus, Utrecht, Netherlands
[2] Univ Med Ctr Utrecht, Dept Psychiat, Brain Ctr Rudolf Magnus, Utrecht, Netherlands
[3] Univ Med Ctr Utrecht, Dept Genet, Ctr Mol Med, Utrecht, Netherlands
[4] Univ Med Ctr Utrecht, Dept Epidemiol, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
Keywords
association studies; complex traits; genetics; multiple test correction; LOW-FREQUENCY; SIGNIFICANCE THRESHOLDS; DEMOGRAPHIC HISTORY; WIDE SIGNIFICANCE; VARIANTS; IDENTIFICATION; DISCOVERY; FRAMEWORK; COMMON; RISK
DOI
10.1002/gepi.22032
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Genome-wide association studies (GWAS) of common disease have been hugely successful in implicating loci that modify disease risk. The bulk of these associations have proven robust and reproducible, in part due to community adoption of statistical criteria for claiming significant genotype-phenotype associations. As the cost of sequencing continues to drop, assembling large samples in global populations is becoming increasingly feasible. Sequencing studies interrogate not only common variants, as was true for genotyping-based GWAS, but variation across the full allele frequency spectrum, yielding many more (independent) statistical tests. We sought to empirically determine genome-wide significance thresholds for various analysis scenarios. Using whole-genome sequence data, we simulated sequencing-based disease studies of varying sample size and ancestry. We determined that future sequencing efforts in >2,000 samples of European, Asian, or admixed ancestry should set genome-wide significance at approximately P = 5 × 10⁻⁹, and studies of African samples should apply a more stringent genome-wide significance threshold of P = 1 × 10⁻⁹. Adoption of a revised multiple test correction will be crucial in avoiding irreproducible claims of association.
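As a rough illustration of what these thresholds imply, the sketch below converts each one into an equivalent number of independent tests. It assumes a Bonferroni-style correction at a family-wise error rate of 0.05, which is not stated in the abstract; the authors report deriving their thresholds empirically from simulation, so this is a back-of-the-envelope reading and not their method. The function names are hypothetical.

```python
# Illustrative sketch only. Assumes a Bonferroni-style correction with a
# family-wise alpha of 0.05; the abstract's thresholds were derived
# empirically, not with this formula.

def bonferroni_threshold(n_tests: float, alpha: float = 0.05) -> float:
    """Per-test P-value threshold for n_tests independent tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_independent_tests(threshold: float, alpha: float = 0.05) -> float:
    """Number of independent tests a given threshold corresponds to."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


if __name__ == "__main__":
    # Thresholds quoted in the abstract.
    for label, p in [("European/Asian/admixed", 5e-9), ("African", 1e-9)]:
        n = implied_independent_tests(p)
        print(f"{label}: P = {p:.0e} ~ {n:.1e} independent tests")
```

Under the same assumption, the conventional GWAS threshold of P = 5 × 10⁻⁸ corresponds to about one million independent tests, so the revised thresholds imply roughly 10 to 50 times as many.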
Pages: 145-151
Number of pages: 7