Bayesian hierarchical hypothesis testing in large-scale genome-wide association analysis

Authors
Samaddar, Anirban [1]
Maiti, Tapabrata [1]
de los Campos, Gustavo [1, 2, 3]
Affiliations
[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
[2] Michigan State Univ, Dept Epidemiol & Biostat, E Lansing, MI 48824 USA
[3] Michigan State Univ, Inst Quantitat Hlth Sci & Engn, E Lansing, MI 48824 USA
Keywords
Bayesian variable selection; Bayesian hierarchical hypothesis testing; false discovery rate; GWAS; collinearity; multiresolution inference; spike and slab prior; linkage disequilibrium; UK-Biobank data; variable selection; regression; heritability; prediction
DOI
10.1093/genetics/iyae164
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Variable selection and large-scale hypothesis testing are techniques commonly used to analyze high-dimensional genomic data. Despite recent advances in theory and methodology, variable selection and inference with highly collinear features remain challenging. For instance, collinearity poses a great challenge in genome-wide association studies involving millions of variants, many of which may be in high linkage disequilibrium. In such settings, collinearity can significantly reduce the power of variable selection methods to identify individual variants associated with an outcome. To address such challenges, we developed Bayesian hierarchical hypothesis testing (BHHT), a novel multiresolution testing procedure that offers high power with adequate error control and fine-mapping resolution. We demonstrate through simulations that the proposed methodology has a power-FDR performance that is competitive with (and in many scenarios better than) state-of-the-art methods. Finally, we demonstrate the feasibility of using BHHT with a large sample size (n ≈ 300,000) and ultra-high-dimensional genotypes (approximately 15 million single-nucleotide polymorphisms, or SNPs) by applying it to eight complex traits using data from the UK-Biobank. Our results show that the proposed methodology leads to many more discoveries than those obtained using traditional SNP-centered inference procedures. The article is accompanied by open-source software that implements the methods described in this study using algorithms that scale to biobank-size ultra-high-dimensional data.
Pages: 12
Related papers
50 in total
  • [41] A Scalable Bayesian Method for Integrating Functional Information in Genome-wide Association Studies
    Yang, Jingjing
    Fritsche, Lars G.
    Zhou, Xiang
    Abecasis, Goncalo
    AMERICAN JOURNAL OF HUMAN GENETICS, 2017, 101 (03) : 404 - 416
  • [42] Enabling genome-wide association testing with multiple diseases and no healthy controls
    Tom, Jennifer
    Chang, Diana
    Wuster, Art
    Mukhyala, Kiran
    Cuenco, Karen
    Cowgill, Amy
    Vogel, Jan
    Reeder, Jens
    Yaspan, Brian
    Hunkapiller, Julie
    Brauer, Matt
    Behrens, Tim
    Forrest, William
    Bhangale, Tushar
    GENE, 2019, 684 : 118 - 123
  • [43] Testing for Polygenic Effects in Genome-Wide Association Studies
    Pan, Wei
    Chen, Yue-Ming
    Wei, Peng
    GENETIC EPIDEMIOLOGY, 2015, 39 (04) : 306 - 316
  • [44] Comments on: Hierarchical inference for genome-wide association studies: a view on methodology with software
    Heller, Ruth
    COMPUTATIONAL STATISTICS, 2020, 35 (01) : 47 - 52
  • [45] Leveraging large-scale multi-omics evidences to identify therapeutic targets from genome-wide association studies
    Lessard, Samuel
    Chao, Michael
    Reis, Kadri
    Beauvais, Mathieu
    Rajpal, Deepak K.
    Sloane, Jennifer
    Palta, Priit
    Klinger, Katherine
    de Rinaldis, Emanuele
    Shameer, Khader
    Chatelain, Clement
    BMC GENOMICS, 2024, 25 (01):
  • [46] Large Scale Genome-Wide Association Study (GWAS) of PTSD by the Psychiatric Genomics Consortium
    Duncan, Laramie E.
    Koenen, Karestan
    Ressler, Kerry
    Nievergelt, Caroline
    Liberzon, Israel
    Daly, Mark
    BIOLOGICAL PSYCHIATRY, 2016, 79 (09) : 164S - 164S
  • [47] Genome-wide association study using single marker analysis and Bayesian methods for the gonadosomatic index in the large yellow croaker
    Gao, Yuxue
    Dong, Linsong
    Xu, Shuangbin
    Xiao, Shijun
    Fang, Ming
    Wang, Zhiyong
    AQUACULTURE, 2018, 486 : 26 - 30
  • [48] Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies
    Mieth, Bettina
    Kloft, Marius
    Rodriguez, Juan Antonio
    Sonnenburg, Soren
    Vobruba, Robin
    Morcillo-Suarez, Carlos
    Farre, Xavier
    Marigorta, Urko M.
    Fehr, Ernst
    Dickhaus, Thorsten
    Blanchard, Gilles
    Schunk, Daniel
    Navarro, Arcadi
    Mueller, Klaus-Robert
    SCIENTIFIC REPORTS, 2016, 6
  • [49] MixTwice: large-scale hypothesis testing for peptide arrays by variance mixing
    Zheng, Zihao
    Mergaert, Aisha M.
    Ong, Irene M.
    Shelef, Miriam A.
    Newton, Michael A.
    BIOINFORMATICS, 2021, 37 (17) : 2637 - 2643
  • [50] Bayesian meta-analysis across genome-wide association studies of diverse phenotypes
    Trochet, Holly
    Pirinen, Matti
    Band, Gavin
    Jostins, Luke
    McVean, Gilean
    Spencer, Chris C. A.
    GENETIC EPIDEMIOLOGY, 2019, 43 (05) : 532 - 547