Resetting the bar: Statistical significance in whole-genome sequencing-based association studies of global populations

Cited by: 49
Authors
Pulit, Sara L. [1]
de With, Sera A. J. [2]
de Bakker, Paul I. W. [3, 4]
Affiliations
[1] Univ Med Ctr Utrecht, Dept Neurol, Brain Ctr Rudolf Magnus, Utrecht, Netherlands
[2] Univ Med Ctr Utrecht, Dept Psychiat, Brain Ctr Rudolf Magnus, Utrecht, Netherlands
[3] Univ Med Ctr Utrecht, Dept Genet, Ctr Mol Med, Utrecht, Netherlands
[4] Univ Med Ctr Utrecht, Dept Epidemiol, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
Keywords
association studies; complex traits; genetics; multiple test correction; low-frequency; significance thresholds; demographic history; wide significance; variants; identification; discovery; framework; common; risk
DOI
10.1002/gepi.22032
Chinese Library Classification number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Genome-wide association studies (GWAS) of common disease have been hugely successful in implicating loci that modify disease risk. The bulk of these associations have proven robust and reproducible, in part due to community adoption of statistical criteria for claiming significant genotype-phenotype associations. As the cost of sequencing continues to drop, assembling large samples in global populations is becoming increasingly feasible. Sequencing studies interrogate not only common variants, as was true for genotyping-based GWAS, but variation across the full allele frequency spectrum, yielding many more (independent) statistical tests. We sought to empirically determine genome-wide significance thresholds for various analysis scenarios. Using whole-genome sequence data, we simulated sequencing-based disease studies of varying sample size and ancestry. We determined that future sequencing efforts in >2,000 samples of European, Asian, or admixed ancestry should set genome-wide significance at approximately P = 5 × 10^-9, and studies of African samples should apply a more stringent genome-wide significance threshold of P = 1 × 10^-9. Adoption of a revised multiple test correction will be crucial in avoiding irreproducible claims of association.
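To put the thresholds in the abstract on a familiar scale, the sketch below reads each one as a Bonferroni-style correction and back-computes the number of independent tests it would correspond to. This is an illustrative assumption only: the abstract says the thresholds were determined empirically from simulated studies, not by this formula. The family-wise error rate of 0.05 and both function names are likewise assumptions introduced here, not taken from the paper.

```python
# Illustrative sketch only: a Bonferroni-style reading of the thresholds
# quoted in the abstract. The paper derived them empirically by simulation;
# alpha = 0.05 and the function names are assumptions made for this example.

def bonferroni_threshold(n_independent_tests: float, alpha: float = 0.05) -> float:
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    if n_independent_tests <= 0:
        raise ValueError("n_independent_tests must be positive")
    return alpha / n_independent_tests


def implied_independent_tests(threshold: float, alpha: float = 0.05) -> float:
    """Number of independent tests a given per-test threshold would correspond to."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Thresholds as quoted in the abstract, plus the conventional GWAS threshold
# (5e-8) for comparison.
thresholds = {
    "conventional GWAS": 5e-8,
    "European, Asian, or admixed ancestry (sequencing)": 5e-9,
    "African ancestry (sequencing)": 1e-9,
}

for label, p in thresholds.items():
    print(f"{label}: P = {p:.0e} ~ {implied_independent_tests(p):.1e} independent tests")
```

Under these assumptions, the threshold of 5 × 10^-9 corresponds to about 10 million independent tests and 1 × 10^-9 to about 50 million, compared with about 1 million for the conventional 5 × 10^-8.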
Pages: 145-151
Page count: 7