Mode jumping MCMC for Bayesian variable selection in GLMM

Cited by: 13
Authors
Hubin, Aliaksandr [1, 2]
Storvik, Geir [1]
Affiliations
[1] Univ Oslo, Dept Math, Oslo, Norway
[2] Univ Oslo, Moltke Moes Vei 35, N-0851 Oslo, Norway
Keywords
Bayesian variable selection; Bayesian model averaging; Generalized linear mixed models; Auxiliary variables MCMC; Combinatorial optimization; High performance computations; MARGINAL LIKELIHOOD; INFERENCE; APPROXIMATIONS; ALGORITHM; SAMPLER;
DOI
10.1016/j.csda.2018.05.020
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Generalized linear mixed models (GLMMs) are a powerful tool for inference and prediction across a wide range of applications. As more data sources become available, the number of candidate explanatory variables for these models grows, and selecting an optimal combination of variables becomes crucial. In a Bayesian setting, the posterior distribution of the models given the observed data can be viewed as a relevant measure of model evidence. The number of possible models increases exponentially with the number of candidate variables. Moreover, the model space has numerous local extrema in terms of posterior model probabilities. To address these issues, a novel MCMC algorithm that searches the model space via efficient mode jumping for GLMMs is introduced. The algorithm relies on the fact that marginal likelihoods can be efficiently calculated within each model. It is recommended that either exact expressions or precise approximations of marginal likelihoods be applied. The suggested algorithm is applied to simulated data, the well-known U.S. crime data, protein activity data, and epigenetic data, and is compared to several existing approaches. (C) 2018 Elsevier B.V. All rights reserved.
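The abstract's point that the model space grows exponentially, and that posterior model probabilities follow from per-model marginal likelihoods, can be illustrated with a minimal sketch. This is not the mode jumping MCMC algorithm of the paper: it enumerates all 2^p models exhaustively, which is feasible only for small p. The function `toy_log_marginal_likelihood` and the `gains` values are hypothetical stand-ins for a real marginal likelihood (exact or approximated), and a uniform model prior is assumed.

```python
import itertools
import math


def toy_log_marginal_likelihood(gamma, gains):
    """Hypothetical stand-in for log p(y | model); not taken from the paper.

    gamma is a tuple of 0/1 inclusion indicators, one per candidate variable.
    """
    return sum(g * w for g, w in zip(gamma, gains))


def posterior_model_probabilities(p, log_ml, log_prior=lambda gamma: 0.0):
    """Enumerate all 2**p models and normalize p(y | m) * p(m) over them."""
    models = list(itertools.product((0, 1), repeat=p))
    log_post = [log_ml(m) + log_prior(m) for m in models]
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return {m: w / total for m, w in zip(models, weights)}


def inclusion_probabilities(post, p):
    """Marginal posterior probability that each variable is in the model."""
    return [sum(prob for m, prob in post.items() if m[j] == 1) for j in range(p)]


gains = [2.0, -1.0, 0.5]  # illustrative values only
post = posterior_model_probabilities(
    len(gains), lambda m: toy_log_marginal_likelihood(m, gains)
)
incl = inclusion_probabilities(post, len(gains))
best = max(post, key=post.get)
print(len(post), best, [round(v, 3) for v in incl])
```

With p candidate variables the dictionary `post` holds 2^p entries, so this enumeration becomes impractical as p grows; that is the setting in which a stochastic search over the model space is needed instead.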
Pages: 281-297
Page count: 17