Bayesian high-dimensional covariate selection in non-linear mixed-effects models using the SAEM algorithm

Cited by: 0
Authors
Marion Naveau
Guillaume Kon Kam King
Renaud Rincent
Laure Sansonnet
Maud Delattre
Affiliations
[1] UMR MIA Paris-Saclay, Université Paris
[2] MaIAGE, Saclay, AgroParisTech, INRAE
[3] GQE - Le Moulon, Université Paris
Source
Statistics and Computing | 2024 / Vol. 34
Keywords
High-dimension; Non-linear mixed-effects models; SAEM algorithm; Spike-and-slab prior; Variable selection
DOI
Not available
Abstract
High-dimensional variable selection, with many more covariates than observations, is widely documented in standard regression models, but there are still few tools to address it in non-linear mixed-effects models, where data are collected repeatedly on several individuals. In this work, variable selection is approached from a Bayesian perspective and a selection procedure is proposed that combines a spike-and-slab prior with the Stochastic Approximation version of the Expectation Maximisation (SAEM) algorithm. As in Lasso regression, the set of relevant covariates is selected by exploring a grid of values for the penalisation parameter. The SAEM approach is much faster than a classical Markov chain Monte Carlo algorithm, and the method shows very good selection performance on simulated data. Its flexibility is demonstrated by implementing it for a variety of non-linear mixed-effects models. The usefulness of the proposed method is illustrated on a genetic marker identification problem relevant to genomic-assisted selection in plant breeding.