A comparison of multiple-imputation methods for handling missing data in repeated measurements observational studies

Cited by: 27
Authors
Kalaycioglu, Oya [1, 2]
Copas, Andrew [3]
King, Michael [1]
Omar, Rumana Z. [1]
Affiliations
[1] UCL, London WC1E 6BT, England
[2] Abant Izzet Baysal Univ, Bolu, Turkey
[3] MRC, Clin Trials Unit, London, England
Keywords
Bayesian imputation; Imputation by chained equations; Missing data; Multilevel data; Multiple imputation; Multivariate normal imputation; OUTCOMES;
DOI
10.1111/rssa.12140
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General];
Subject classification codes
03; 0303; 0701; 070101;
Abstract
Multiple-imputation (MI) methods for imputing missing data in observational health studies with repeated measurements were evaluated, with particular focus on incomplete time-varying explanatory variables. Standard and random-effects imputation by chained equations, multivariate normal imputation and Bayesian MI were compared in simulation studies with respect to the bias and efficiency of regression coefficient estimates. The flexibility of the methods in handling different types of variables (binary, categorical, skewed and normally distributed) and different correlations between the repeated measurements of the incomplete variables was also compared. Multivariate normal imputation produced the least bias in most situations, is theoretically well justified and allows a flexible correlation structure for the repeated measurements. It can be recommended for imputing continuous variables. Bayesian MI is efficient and may be preferable in the presence of categorical and non-normally distributed continuous variables. Imputation by chained equations approaches were sensitive to the correlation between the repeated measurements. The moving time window approach may be used for normally distributed continuous variables with autoregressive correlation.
Pages: 683-706
Number of pages: 24