Bayesian hierarchical hypothesis testing in large-scale genome-wide association analysis

Authors
Samaddar, Anirban [1]
Maiti, Tapabrata [1]
de los Campos, Gustavo [1, 2, 3]
Affiliations
[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
[2] Michigan State Univ, Dept Epidemiol & Biostat, E Lansing, MI 48824 USA
[3] Michigan State Univ, Inst Quantitat Hlth Sci & Engn, E Lansing, MI 48824 USA
Keywords
Bayesian variable selection; Bayesian hierarchical hypothesis testing; false discovery rate; GWAS; collinearity; multiresolution inference; spike and slab prior; linkage disequilibrium; UK-Biobank data; variable selection; regression; heritability; prediction
DOI
10.1093/genetics/iyae164
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Variable selection and large-scale hypothesis testing are techniques commonly used to analyze high-dimensional genomic data. Despite recent advances in theory and methodology, variable selection and inference with highly collinear features remain challenging. For instance, collinearity poses a great challenge in genome-wide association studies involving millions of variants, many of which may be in high linkage disequilibrium. In such settings, collinearity can significantly reduce the power of variable selection methods to identify individual variants associated with an outcome. To address such challenges, we developed Bayesian hierarchical hypothesis testing (BHHT), a novel multiresolution testing procedure that offers high power with adequate error control and fine-mapping resolution. We demonstrate through simulations that the proposed methodology has a power-FDR performance that is competitive with (and in many scenarios better than) state-of-the-art methods. Finally, we demonstrate the feasibility of using BHHT with a large sample size (n ≈ 300,000) and ultra-high-dimensional genotypes (approximately 15 million single-nucleotide polymorphisms, or SNPs) by applying it to eight complex traits using data from the UK-Biobank. Our results show that the proposed methodology leads to many more discoveries than those obtained using traditional SNP-centered inference procedures. The article is accompanied by open-source software that implements the methods described in this study using algorithms that scale to biobank-size ultra-high-dimensional data.
Pages: 12
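
The abstract's point about collinearity and multiresolution inference can be illustrated with a toy sketch. This is not the authors' BHHT software or algorithm: the function names `set_pip` and `discover`, the 0.95 threshold, the recursive halving of adjacent variants (standing in for whatever hierarchy a real analysis would use), and the hand-made posterior samples are all assumptions chosen for illustration. The sketch assumes posterior samples of inclusion indicators are already available from a Bayesian variable selection model, and it reports the smallest variant sets whose set-level posterior inclusion probability passes the threshold.

```python
import numpy as np


def set_pip(samples, idx):
    """Posterior probability that at least one variant in `idx` is included.

    `samples` is a (draws x variants) boolean matrix of inclusion indicators.
    """
    return samples[:, idx].any(axis=1).mean()


def discover(samples, idx, threshold=0.95):
    """Return the smallest sets, found by recursive halving, that pass `threshold`.

    Hypothetical helper: a set is reported only if none of its halves
    yields a discovery at finer resolution.
    """
    if set_pip(samples, idx) < threshold:
        return []
    if len(idx) == 1:
        return [idx]
    mid = len(idx) // 2
    found = discover(samples, idx[:mid], threshold) + discover(samples, idx[mid:], threshold)
    return found if found else [idx]


# Hand-made posterior draws for 4 variants (an assumption, not real data).
# Variants 0 and 1 mimic two SNPs in high linkage disequilibrium: every draw
# includes exactly one of them, so each has an individual PIP of 0.5.
samples = np.zeros((100, 4), dtype=bool)
samples[0::2, 0] = True   # variant 0 included in even-numbered draws
samples[1::2, 1] = True   # variant 1 included in odd-numbered draws
samples[:, 3] = True      # variant 3 included in every draw

individual_pips = [set_pip(samples, [j]) for j in range(4)]
discoveries = discover(samples, list(range(4)))
print(individual_pips)    # [0.5, 0.5, 0.0, 1.0]
print(discoveries)        # [[0, 1], [3]]
```

In this toy case a SNP-centered rule at the same threshold would report only variant 3, whereas the set-level rule also reports the pair {0, 1}: the signal is localized to the pair even though neither variant can be singled out.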