Bayesian compositional generalized linear models for analyzing microbiome data

Times cited: 2
Authors
Zhang, Li [1 ]
Zhang, Xinyan [2 ]
Yi, Nengjun [1 ]
Affiliations
[1] Univ Alabama Birmingham, Dept Biostat, Birmingham, AL 35294 USA
[2] Kennesaw State Univ, Sch Data Sci & Analyt, Kennesaw, GA USA
Keywords
Bayesian models; compositional data; MCMC; microbiome; sum-to-zero restriction; STATISTICAL-ANALYSIS; GUT MICROBIOTA; REGRESSION;
DOI
10.1002/sim.9946
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The crucial impact of the microbiome on human health and disease has gained significant scientific attention. Researchers seek to connect microbiome features with health conditions, aiming to predict diseases and develop personalized medicine strategies. However, the practicality of conventional models is restricted by important aspects of microbiome data. Specifically, the observed data are compositional, as the counts within each sample are bound by a fixed-sum constraint. Moreover, microbiome data are often high-dimensional, with the number of variables exceeding the number of available samples. In addition, microbiome features exhibiting phenotypical similarity usually have similar influence on the response variable. To address the challenges posed by these aspects of the data structure, we proposed Bayesian compositional generalized linear models for analyzing microbiome data (BCGLM) with a structured regularized horseshoe prior for the compositional coefficients and a soft sum-to-zero restriction on the coefficients through the prior distribution. We fitted the proposed models using Markov chain Monte Carlo (MCMC) algorithms with the R package rstan. The performance of the proposed method was assessed in extensive simulation studies. The simulation results show that our approach outperforms existing methods, with more accurate coefficient estimates and lower prediction error. We also applied the proposed method to a microbiome study to find microorganisms linked to inflammatory bowel disease (IBD). To make this work reproducible, the code and data used in this article are available at .
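The role of the sum-to-zero restriction mentioned in the abstract can be illustrated with a small sketch. It is not the authors' rstan implementation: it assumes a log-contrast linear predictor (log-transformed counts times coefficients) and, as one possible reading of a "soft" restriction, a tight zero-centred normal prior on the sum of the coefficients. The function names, the prior scale `sd`, and the toy data are all hypothetical.

```python
import numpy as np


def log_contrast(counts, beta):
    """Linear predictor on log-transformed counts (assumed log-contrast form)."""
    return np.log(counts) @ beta


def soft_sum_to_zero_logprior(beta, sd=1e-3):
    """Log-density of Normal(0, sd) evaluated at sum(beta).

    A small sd pulls the coefficient sum towards zero without forcing it
    to be exactly zero (an assumed form of the soft restriction).
    """
    s = beta.sum()
    return -0.5 * (s / sd) ** 2 - np.log(sd) - 0.5 * np.log(2.0 * np.pi)


rng = np.random.default_rng(0)
counts = rng.integers(1, 100, size=(5, 4)).astype(float)  # toy taxa counts
depth = np.array([1.0, 2.0, 5.0, 10.0, 50.0])             # per-sample scaling
scaled = counts * depth[:, None]                          # same composition

beta = np.array([1.0, -0.5, 0.25, -0.75])  # sums to zero
beta_bad = beta + 0.1                      # sums to 0.4

# With sum(beta) == 0 the predictor ignores the per-sample scaling.
eta, eta_scaled = log_contrast(counts, beta), log_contrast(scaled, beta)

# Otherwise the predictor shifts by log(depth) * sum(beta_bad).
eta_bad = log_contrast(counts, beta_bad)
eta_bad_scaled = log_contrast(scaled, beta_bad)

print(np.allclose(eta, eta_scaled))
print(np.allclose(eta_bad, eta_bad_scaled))
print(soft_sum_to_zero_logprior(beta) > soft_sum_to_zero_logprior(beta_bad))
```

Under these assumptions, multiplying a sample's counts by a constant adds that constant's logarithm times the coefficient sum to the predictor, so the prediction depends only on the composition when the sum is zero. A prior on the sum enforces this approximately rather than exactly.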
Pages: 141-155
Number of pages: 15
Related papers
50 in total
  • [1] Bayesian compositional generalized linear mixed models for disease prediction using microbiome data
    Zhang, Li
    Zhang, Xinyan
    Leach, Justin M.
    Rahman, A. K. M. F.
    Howell, Carrie R.
    Yi, Nengjun
    BMC BIOINFORMATICS, 2025, 26 (01):
  • [2] Generalized linear models with linear constraints for microbiome compositional data
    Lu, Jiarui
    Shi, Pixu
    Li, Hongzhe
    BIOMETRICS, 2019, 75 (01) : 235 - 244
  • [3] Bayesian compositional models for ordinal response
    Zhang, Li
    Zhang, Xinyan
    Leach, Justin M.
    Rahman, A. K. M. F.
    Yi, Nengjun
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2024, 33 (06) : 1043 - 1054
  • [4] Bayesian Graphical Compositional Regression for Microbiome Data
    Mao, Jialiang
    Chen, Yuhan
    Ma, Li
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115 (530) : 610 - 624
  • [5] Analyzing the overall effects of the microbiome abundance data with a Bayesian predictive value approach
    Zhang, Xinyan
    Yi, Nengjun
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2022, 31 (10) : 1992 - 2003
  • [6] Interaction Models and Generalized Score Matching for Compositional Data
    Yu, Shiqing
    Drton, Mathias
    Shojaie, Ali
    LEARNING ON GRAPHS CONFERENCE, VOL 231, 2023, 231
  • [7] Bayesian Generalized Horseshoe Estimation of Generalized Linear Models
    Schmidt, Daniel F.
    Makalic, Enes
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 11907 : 598 - 613
  • [8] Data Augmentation for Compositional Data: Advancing Predictive Models of the Microbiome
    Gordon-Rodriguez, Elliott
    Quinn, Thomas P.
    Cunningham, John P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [9] A Bayesian joint model for compositional mediation effect selection in microbiome data
    Fu, Jingyan
    Koslovsky, Matthew D.
    Neophytou, Andreas M.
    Vannucci, Marina
    STATISTICS IN MEDICINE, 2023, 42 (17) : 2999 - 3015
  • [10] A semiparametric Bayesian approach to generalized partial linear mixed models for longitudinal data
    Tang, Nian-Sheng
    Duan, Xing-De
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (12) : 4348 - 4365