Missing data analysis and imputation via latent Gaussian Markov random fields

Cited by: 5
Authors
Gomez-Rubio, Virgilio [1]
Cameletti, Michela [2]
Blangiardo, Marta [3]
Affiliations
[1] Univ Castilla La Mancha, Dept Math, Sch Ind Engn, Albacete, Spain
[2] Univ Bergamo, Dept Econ, Bergamo, Italy
[3] Imperial Coll London, Dept Epidemiol & Biostat, London, England
Funding
UK Medical Research Council
Keywords
Imputation; missing values; GMRF; INLA; sensitivity analysis; multiple imputation
DOI
10.2436/20.8080.02.124
Chinese Library Classification (CLC) codes
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
This paper recasts the problem of missing values in the covariates of a regression model as a latent Gaussian Markov random field (GMRF) model in a fully Bayesian framework. The proposed approach is based on the definition of the covariate imputation sub-model as a latent effect with a GMRF structure. This formulation works for continuous covariates, whereas for categorical covariates a typical multiple imputation approach is employed. Both techniques can be easily combined when both continuous and categorical variables have missing values. The resulting Bayesian hierarchical model naturally fits within the integrated nested Laplace approximation (INLA) framework, which is used for model fitting. Hence, this work fills an important gap in the INLA methodology, as it allows models with missing values in the covariates to be fitted. As in any other fully Bayesian framework, relying on INLA for model fitting makes it possible to formulate a joint model for the data, the imputed covariates and their missingness mechanism. In this way, it is possible to tackle the more general problem of assessing the missingness mechanism by conducting a sensitivity analysis on the different alternatives for modeling the non-observed covariates. Finally, the proposed approach is illustrated with two examples on modeling health risk factors and disease mapping.
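As a rough illustration of why a GMRF structure lends itself to imputation, the sketch below computes the full conditional of a single missing node given its neighbours, using the standard result that for x ~ N(mu, Q^{-1}) the conditional mean of x_i is mu_i - (1/Q_ii) * sum_{j != i} Q_ij (x_j - mu_j) and the conditional variance is 1/Q_ii. This is not the paper's model and not INLA: the function name `gmrf_full_conditional`, the 4-node chain, and all numeric values are hypothetical choices made only for this example.

```python
def gmrf_full_conditional(i, x, mu, Q):
    """Mean and variance of x[i] given all other nodes, for x ~ N(mu, Q^{-1}).

    Q is the precision matrix (list of lists). x[i] itself is never read,
    so it may be None to mark a missing value.
    """
    # Only neighbours of node i (non-zero Q[i][j]) contribute to the sum.
    s = sum(
        Q[i][j] * (x[j] - mu[j])
        for j in range(len(x))
        if j != i and Q[i][j] != 0.0
    )
    cond_mean = mu[i] - s / Q[i][i]
    cond_var = 1.0 / Q[i][i]
    return cond_mean, cond_var


# Hypothetical 4-node chain with a tridiagonal (sparse) precision matrix.
Q = [
    [2.0, -1.0, 0.0, 0.0],
    [-1.0, 2.0, -1.0, 0.0],
    [0.0, -1.0, 2.0, -1.0],
    [0.0, 0.0, -1.0, 2.0],
]
mu = [0.0, 0.0, 0.0, 0.0]
x = [1.0, None, 3.0, 2.0]  # the covariate value at node 1 is missing

mean_1, var_1 = gmrf_full_conditional(1, x, mu, Q)
print(mean_1, var_1)
```

Because the precision matrix is sparse, the conditional depends only on the neighbouring nodes 0 and 2; node 3 does not enter. A fully Bayesian treatment such as the one described in the abstract would additionally propagate uncertainty in the hyperparameters and link the imputed covariate to the outcome model, which this sketch does not attempt.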
Pages: 217-244
Number of pages: 28