Bayesian hierarchical hypothesis testing in large-scale genome-wide association analysis

Authors
Samaddar, Anirban [1]
Maiti, Tapabrata [1]
de los Campos, Gustavo [1, 2, 3]
Affiliations
[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
[2] Michigan State Univ, Dept Epidemiol & Biostat, E Lansing, MI 48824 USA
[3] Michigan State Univ, Inst Quantitat Hlth Sci & Engn, E Lansing, MI 48824 USA
Keywords
Bayesian variable selection; Bayesian hierarchical hypothesis testing; false discovery rate; GWAS; collinearity; multiresolution inference; spike and slab prior; linkage disequilibrium; UK-Biobank data; FALSE DISCOVERY RATE; VARIABLE-SELECTION; REGRESSION; HERITABILITY; PREDICTION;
DOI
10.1093/genetics/iyae164
Chinese Library Classification number
Q3 [Genetics];
Subject classification code
071007; 090102;
Abstract
Variable selection and large-scale hypothesis testing are techniques commonly used to analyze high-dimensional genomic data. Despite recent advances in theory and methodology, variable selection and inference with highly collinear features remain challenging. For instance, collinearity poses a great challenge in genome-wide association studies involving millions of variants, many of which may be in high linkage disequilibrium. In such settings, collinearity can significantly reduce the power of variable selection methods to identify individual variants associated with an outcome. To address such challenges, we developed Bayesian hierarchical hypothesis testing (BHHT), a novel multiresolution testing procedure that offers high power with adequate error control and fine-mapping resolution. We demonstrate through simulations that the proposed methodology has a power-FDR performance that is competitive with (and in many scenarios better than) state-of-the-art methods. Finally, we demonstrate the feasibility of using BHHT with a large sample size (n ≈ 300,000) and ultra-high-dimensional genotypes (approximately 15 million single-nucleotide polymorphisms, or SNPs) by applying it to eight complex traits using data from the UK-Biobank. Our results show that the proposed methodology leads to many more discoveries than those obtained using traditional SNP-centered inference procedures. The article is accompanied by open-source software that implements the methods described in this study using algorithms that scale to biobank-size ultra-high-dimensional data.
Pages: 12