Bayesian high-dimensional covariate selection in non-linear mixed-effects models using the SAEM algorithm

Authors:

Marion Naveau

Guillaume Kon Kam King

Renaud Rincent

Laure Sansonnet

Maud Delattre

Affiliations:

[1] UMR MIA Paris-Saclay, Université Paris

[2] MaIAGE, Saclay, AgroParisTech, INRAE

[3] GQE - Le Moulon, Université Paris

Source:

Statistics and Computing | 2024 / vol. 34

Keywords:

High-dimension; Non-linear mixed-effects models; SAEM algorithm; Spike-and-slab prior; Variable selection

Abstract:

High-dimensional variable selection, with many more covariates than observations, is widely documented in standard regression models, but there are still few tools to address it in non-linear mixed-effects models, where data are collected repeatedly on several individuals. In this work, variable selection is approached from a Bayesian perspective and a selection procedure is proposed, combining a spike-and-slab prior with the Stochastic Approximation version of the Expectation Maximisation (SAEM) algorithm. As in Lasso regression, the set of relevant covariates is selected by exploring a grid of values for the penalisation parameter. The SAEM approach is much faster than a classical Markov chain Monte Carlo algorithm, and the method shows very good selection performance on simulated data. Its flexibility is demonstrated by implementing it for a variety of non-linear mixed-effects models. The usefulness of the proposed method is illustrated on a problem of genetic marker identification, relevant for genomic-assisted selection in plant breeding.
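
To make the spike-and-slab idea and the grid exploration concrete, here is a minimal toy sketch. It is not the authors' procedure. The abstract does not state the form of the prior, so a two-component Gaussian mixture is assumed: each coefficient is drawn from a narrow "spike" N(0, nu0) with probability 1 - alpha, or from a wide "slab" N(0, nu1) with probability alpha. All names (`inclusion_probability`, `select_over_grid`, `nu0`, `nu1`, `alpha`, `threshold`) are hypothetical. The SAEM estimation step is not reproduced: the coefficient estimates are held fixed across the grid purely for illustration, whereas a real procedure would presumably re-estimate them for each grid value.

```python
import math


def normal_pdf(x, var):
    """Density of a centred Gaussian N(0, var) evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def inclusion_probability(beta, nu0, nu1, alpha):
    """Posterior probability that beta comes from the slab component.

    Assumed prior (not taken from the abstract):
        beta | delta ~ (1 - delta) * N(0, nu0) + delta * N(0, nu1)
        delta ~ Bernoulli(alpha)
    """
    slab = alpha * normal_pdf(beta, nu1)
    spike = (1.0 - alpha) * normal_pdf(beta, nu0)
    return slab / (slab + spike)


def select_over_grid(betas, nu0_grid, nu1=10.0, alpha=0.5, threshold=0.5):
    """For each spike variance in the grid, list the selected covariate indices.

    A covariate is selected when its inclusion probability exceeds `threshold`.
    The coefficients `betas` are treated as given (toy simplification).
    """
    selected = {}
    for nu0 in nu0_grid:
        selected[nu0] = [
            j
            for j, beta in enumerate(betas)
            if inclusion_probability(beta, nu0, nu1, alpha) > threshold
        ]
    return selected


if __name__ == "__main__":
    # Hypothetical coefficient estimates: two near zero, two clearly non-zero.
    toy_betas = [0.01, 2.0, -0.02, -1.5]
    for nu0, support in select_over_grid(toy_betas, [0.001, 0.01, 0.1]).items():
        print(f"nu0={nu0}: selected covariates {support}")
```

In this sketch the spike variance `nu0` plays the role of the penalisation parameter: each grid value yields a candidate set of covariates, and a separate model selection criterion would then be needed to choose among the candidate sets.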
