A comparison of multiple-imputation methods for handling missing data in repeated measurements observational studies

Cited by: 27
Authors
Kalaycioglu, Oya [1, 2]
Copas, Andrew [3]
King, Michael [1]
Omar, Rumana Z. [1]
Affiliations
[1] UCL, London WC1E 6BT, England
[2] Abant Izzet Baysal Univ, Bolu, Turkey
[3] MRC, Clin Trials Unit, London, England
Keywords
Bayesian imputation; Imputation by chained equations; Missing data; Multilevel data; Multiple imputation; Multivariate normal imputation; OUTCOMES;
DOI
10.1111/rssa.12140
Chinese Library Classification number
O1 [Mathematics]; C [Social Sciences, General];
Discipline classification code
03 ; 0303 ; 0701 ; 070101 ;
Abstract
Multiple-imputation (MI) methods for imputing missing data in observational health studies with repeated measurements were evaluated with particular focus on incomplete time varying explanatory variables. Standard and random-effects imputation by chained equations, multivariate normal imputation and Bayesian MI were compared regarding bias and efficiency of regression coefficient estimates by using simulation studies. Flexibility of the methods in handling different types of variables (binary, categorical, skewed and normally distributed) and correlations between the repeated measurements of the incomplete variables were also compared. Multivariate normal imputation produced the least bias in most situations, is theoretically well justified and allows flexible correlation for the repeated measurements. It can be recommended for imputing continuous variables. Bayesian MI is efficient and may be preferable in the presence of categorical and non-normally distributed continuous variables. Imputation by chained equations approaches were sensitive to the correlation between the repeated measurements. The moving time window approach may be used for normally distributed continuous variables with auto-regressive correlation.
Pages: 683-706
Number of pages: 24