Bayesian Variable Shrinkage and Selection in Compositional Data Regression: Application to Oral Microbiome

Authors
Datta, Jyotishka [1]
Bandyopadhyay, Dipankar [2]
Affiliations
[1] Virginia Polytech Inst & State Univ, Dept Stat, 250 Drillfield Dr, Blacksburg, VA 24061 USA
[2] Virginia Commonwealth Univ, Sch Populat Hlth, Dept Biostat, One Capital Sq,7th Floor,830 East Main St,POB 9800, Richmond, VA 23298 USA
Funding
National Institutes of Health (USA)
Keywords
Bayesian; Compositional data; Generalized Dirichlet; Dirichlet; Large p; Shrinkage prior; Sparse probability vectors; Stick-breaking; Horseshoe; ASYMPTOTIC PROPERTIES; PRIORS; ESTIMATOR; INFERENCE; RISK;
DOI
10.1007/s41096-024-00194-9
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Microbiome studies generate multivariate compositional responses, such as taxa counts, which are strictly non-negative, bounded, reside within a simplex, and are subject to a unit-sum constraint. In the presence of covariates (which can be moderate to high dimensional), they are popularly modeled via the Dirichlet-Multinomial (D-M) regression framework. In this paper, we consider a Bayesian approach for estimation and inference under a D-M compositional framework, and present a comparative evaluation of some state-of-the-art continuous shrinkage priors for efficient variable selection to identify the most significant associations between the available covariates and taxonomic abundance. Specifically, we compare the performance of the horseshoe and horseshoe+ priors (with the Bayesian lasso as the benchmark), using Hamiltonian Monte Carlo techniques for posterior sampling and for generating posterior credible intervals. Our simulation studies using synthetic data demonstrate excellent recovery and estimation accuracy in the sparse parameter regime by the continuous shrinkage priors. We further illustrate our method via application to a motivating oral microbiome dataset generated from the NYC-HANES study. An RStan implementation of our method is available on GitHub: https://github.com/dattahub/compshrink.
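To make the model in the abstract concrete, here is a minimal Python sketch of the generative side of a D-M regression with horseshoe-distributed coefficients. It is not the authors' RStan implementation and does no posterior sampling. The function name `simulate_dm_horseshoe`, the log link from covariates to Dirichlet concentrations, the fixed global scale `tau`, and the clipping of the linear predictor are all assumptions made for illustration.

```python
import numpy as np

def simulate_dm_horseshoe(n=40, p=8, k=5, depth=500, tau=0.1, seed=0):
    """Simulate Dirichlet-Multinomial counts with horseshoe-distributed coefficients.

    Hypothetical helper for illustration only; not part of the compshrink repository.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))                 # covariate matrix (made-up design)
    # Horseshoe prior: beta_jk | lam_jk, tau ~ N(0, lam_jk^2 * tau^2),
    # with local scales lam_jk ~ half-Cauchy(0, 1).
    lam = np.abs(rng.standard_cauchy((p, k)))
    beta = rng.standard_normal((p, k)) * lam * tau
    # Assumed log link to the Dirichlet concentrations. The clipping is only a
    # numerical safeguard for this demo, not part of the model.
    eta = np.clip(X @ beta, -5.0, 5.0)
    alpha = np.exp(eta)
    counts = np.empty((n, k), dtype=np.int64)
    for i in range(n):
        probs = rng.dirichlet(alpha[i])             # sample-specific composition on the simplex
        counts[i] = rng.multinomial(depth, probs)   # taxa counts at fixed sequencing depth
    return X, beta, counts

X, beta, counts = simulate_dm_horseshoe()
print(counts[:3])
```

Most entries of `beta` drawn this way sit close to zero while a few are large. This is the sparse regime that the shrinkage priors are compared on.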
Pages: 491-515
Number of pages: 25