Bayesian hierarchical hypothesis testing in large-scale genome-wide association analysis

Cited by: 0

Authors:

Samaddar, Anirban [1]

Maiti, Tapabrata [1]

de los Campos, Gustavo [1,2,3]

Affiliations:

[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA

[2] Michigan State Univ, Dept Epidemiol & Biostat, E Lansing, MI 48824 USA

[3] Michigan State Univ, Inst Quantitat Hlth Sci & Engn, E Lansing, MI 48824 USA

Source:

GENETICS | 2024 / Volume 228 / Issue 04

Keywords:

Bayesian variable selection; Bayesian hierarchical hypothesis testing; false discovery rate; GWAS; collinearity; multiresolution inference; spike and slab prior; linkage disequilibrium; UK-Biobank data; variable selection; regression; heritability; prediction

DOI:

10.1093/genetics/iyae164

Chinese Library Classification (CLC) number:

Q3 [Genetics];

Discipline classification codes:

071007 ; 090102 ;

Abstract:

Variable selection and large-scale hypothesis testing are techniques commonly used to analyze high-dimensional genomic data. Despite recent advances in theory and methodology, variable selection and inference with highly collinear features remain challenging. For instance, collinearity poses a great challenge in genome-wide association studies involving millions of variants, many of which may be in high linkage disequilibrium. In such settings, collinearity can significantly reduce the power of variable selection methods to identify individual variants associated with an outcome. To address such challenges, we developed Bayesian hierarchical hypothesis testing (BHHT), a novel multiresolution testing procedure that offers high power with adequate error control and fine-mapping resolution. We demonstrate through simulations that the proposed methodology has a power-FDR performance that is competitive with (and in many scenarios better than) state-of-the-art methods. Finally, we demonstrate the feasibility of using BHHT with large sample sizes (n ≈ 300,000) and ultra-high-dimensional genotypes (approximately 15 million single-nucleotide polymorphisms, or SNPs) by applying it to eight complex traits using data from the UK-Biobank. Our results show that the proposed methodology leads to many more discoveries than those obtained using traditional SNP-centered inference procedures. The article is accompanied by open-source software that implements the methods described in this study using algorithms that scale to biobank-size ultra-high-dimensional data.


Pages: 12
