Fast and accurate Bayesian polygenic risk modeling with variational inference

Citations: 12
Authors
Zabad, Shadi [1]
Gravel, Simon [2]
Li, Yue [1]
Affiliations
[1] McGill Univ, Sch Comp Sci, Montreal, PQ, Canada
[2] McGill Univ, Dept Human Genet, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
GENOME-WIDE ASSOCIATION; HUMAN COMPLEX TRAITS; UK BIOBANK; VARIABLE SELECTION; MIXED-MODEL; PREDICTION; SCORES; RARE; REGRESSION; VARIANTS;
DOI
10.1016/j.ajhg.2023.03.009
Chinese Library Classification (CLC) number
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
The advent of large-scale genome-wide association studies (GWASs) has motivated the development of statistical methods for phenotype prediction with single-nucleotide polymorphism (SNP) array data. These polygenic risk score (PRS) methods use a multiple linear regression framework to infer joint effect sizes of all genetic variants on the trait. Among the subset of PRS methods that operate on GWAS summary statistics, sparse Bayesian methods have shown competitive predictive ability. However, most existing Bayesian approaches employ Markov chain Monte Carlo (MCMC) algorithms for posterior inference, which are computationally inefficient and do not scale favorably to higher dimensions. Here, we introduce variational inference of polygenic risk scores (VIPRS), a Bayesian summary statistics-based PRS method that utilizes variational inference techniques to approximate the posterior distribution for the effect sizes. Our experiments with 36 simulation configurations and 12 real phenotypes from the UK Biobank dataset demonstrated that VIPRS is consistently competitive with the state-of-the-art in prediction accuracy while being more than twice as fast as popular MCMC-based approaches. This performance advantage is robust across a variety of genetic architectures, SNP heritabilities, and independent GWAS cohorts. In addition to its competitive accuracy on the "White British" samples, VIPRS showed improved transferability when applied to other ethnic groups, with up to a 1.7-fold increase in R² among individuals of Nigerian ancestry for low-density lipoprotein (LDL) cholesterol. To illustrate its scalability, we applied VIPRS to a dataset of 9.6 million genetic markers, which conferred further improvements in prediction accuracy for highly polygenic traits, such as height.
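The linear scoring step and the R² accuracy metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the VIPRS implementation: the function names `polygenic_score` and `prediction_r2` are hypothetical, the genotypes and phenotype are simulated, and the effect sizes are the simulated true effects rather than posterior estimates from any inference method.

```python
import numpy as np


def polygenic_score(genotypes, effect_sizes):
    """Linear PRS: dot product of allele dosages (0/1/2) and per-SNP effect sizes."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(effect_sizes, dtype=float)


def prediction_r2(phenotype, score):
    """Squared Pearson correlation between the observed phenotype and the score."""
    r = np.corrcoef(phenotype, score)[0, 1]
    return r ** 2


# Simulated data (assumption): 500 individuals, 50 SNPs, 5 of them causal.
rng = np.random.default_rng(0)
n_individuals, n_snps = 500, 50
X = rng.binomial(2, 0.3, size=(n_individuals, n_snps)).astype(float)

beta = np.zeros(n_snps)
beta[:5] = rng.normal(0.0, 0.5, size=5)  # sparse genetic architecture

# Phenotype = genetic component + environmental noise.
y = X @ beta + rng.normal(0.0, 1.0, size=n_individuals)

score = polygenic_score(X, beta)
r2 = prediction_r2(y, score)
print(f"Prediction R^2 on simulated data: {r2:.3f}")
```

In practice the effect sizes would come from a summary statistics-based method, and a fold change in accuracy between two methods would be the ratio of their R² values on the same held-out samples.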
Pages: 741-761
Page count: 22