Missing data imputation: focusing on single imputation

Cited by: 508
Author
Zhang, Zhongheng [1]
Affiliation
[1] Zhejiang Univ, Jinhua Hosp, Jinhua Municipal Cent Hosp, Dept Crit Care Med, Jinhua 321000, Peoples R China
Keywords
Big-data clinical trial; missing data; single imputation; longitudinal data; R;
DOI
10.3978/j.issn.2305-5839.2015.12.38
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
Complete case analysis is widely used for handling missing data, and it is the default method in many statistical packages. However, this method may introduce bias, and useful information is discarded from the analysis. Many imputation methods have therefore been developed to fill this gap. The present article focuses on single imputation. Imputation with the mean, median, or mode is simple but, like complete case analysis, can bias estimates of the mean and standard deviation. Furthermore, it ignores relationships with other variables. Regression imputation can preserve the relationship between the variable with missing values and other variables. Many sophisticated methods exist to handle missing values in longitudinal data. This article focuses primarily on how to implement single imputation in R code, while avoiding complex mathematical calculations.
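The single-imputation methods named in the abstract can be illustrated with a minimal base R sketch. This is not the article's own code: the simulated data set and every variable name (dat, age, lac, sex) are hypothetical, chosen only to show mean, median, mode, and regression imputation side by side.

```r
# Illustrative sketch only: simulated data, hypothetical variable names.
set.seed(123)
n   <- 200
age <- rnorm(n, mean = 60, sd = 10)
lac <- 0.05 * age + rnorm(n, sd = 0.5)   # continuous variable related to age
sex <- factor(sample(c("female", "male"), n, replace = TRUE))
dat <- data.frame(age, lac, sex)
dat$lac[sample(n, 40)] <- NA             # set 20% of lac to missing
dat$sex[sample(n, 20)] <- NA             # set 10% of sex to missing
miss <- is.na(dat$lac)

# Mean and median imputation: every gap receives the same constant.
lac_mean   <- ifelse(miss, mean(dat$lac, na.rm = TRUE), dat$lac)
lac_median <- ifelse(miss, median(dat$lac, na.rm = TRUE), dat$lac)

# Mode imputation for a categorical variable (table() ignores NA by default).
sex_mode <- names(which.max(table(dat$sex)))
sex_imp  <- dat$sex
sex_imp[is.na(sex_imp)] <- sex_mode

# Regression imputation: fit on complete cases, predict the missing values.
# lm() drops incomplete rows under R's default na.action (na.omit).
fit     <- lm(lac ~ age, data = dat)
lac_reg <- dat$lac
lac_reg[miss] <- predict(fit, newdata = dat[miss, ])

# Compare spread and the association with age across methods.
summary_tab <- data.frame(
  method  = c("complete case", "mean", "median", "regression"),
  sd_lac  = c(sd(dat$lac, na.rm = TRUE), sd(lac_mean),
              sd(lac_median), sd(lac_reg)),
  cor_age = c(cor(dat$age, dat$lac, use = "complete.obs"),
              cor(dat$age, lac_mean), cor(dat$age, lac_median),
              cor(dat$age, lac_reg))
)
print(summary_tab)
```

In this sketch the mean-imputed series always has a smaller standard deviation than the observed values, because every gap is filled with the same number. Regression imputation instead draws the filled values from the fitted line on age, which is how it keeps the relationship with the other variable.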
Pages: 8
Related papers
11 in total
[1]   Handling missing data in RCTs; a review of the top medical journals [J].
Bell, Melanie L. ;
Fiero, Mallorie ;
Horton, Nicholas J. ;
Hsu, Chiu-Hsieh .
BMC MEDICAL RESEARCH METHODOLOGY, 2014, 14
[2]   Bias due to missing exposure data using complete-case analysis in the proportional hazards regression model [J].
Demissie, S ;
LaValley, MP ;
Horton, NJ ;
Glynn, RJ ;
Cupples, LA .
STATISTICS IN MEDICINE, 2003, 22 (04) :545-557
[3]   Imputation of missing longitudinal data: a comparison of methods [J].
Engels, JM ;
Diehr, P .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 2003, 56 (10) :968-976
[4]   Genolini C., 2013, Open Journal of Statistics, V03, P26, DOI 10.4236/ojs.2013.34A004
[5]  
Genolini C, LONGITUDINALDATA LON
[6]   Unpredictable bias when using the missing indicator method or complete case analysis for missing confounder values: an empirical example [J].
Knol, Mirjam J. ;
Janssen, Kristel J. M. ;
Donders, A. Rogier T. ;
Egberts, Antoine C. G. ;
Heerdink, E. Rob ;
Grobbee, Diederick E. ;
Moons, Karel G. M. ;
Geerlings, Mirjam I. .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 2010, 63 (07) :728-736
[7]   Effects of Different Missing Data Imputation Techniques on the Performance of Undiagnosed Diabetes Risk Prediction Models in a Mixed-Ancestry Population of South Africa [J].
Masconi, Katya L. ;
Matsha, Tandi E. ;
Erasmus, Rajiv T. ;
Kengne, Andre P. .
PLOS ONE, 2015, 10 (09)
[8]   Attrition in longitudinal studies: How to deal with missing data [J].
Twisk, J ;
de Vente, W .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 2002, 55 (04) :329-337
[9]  
van Buuren S, 2011, J STAT SOFTW, V45, P1
[10]   Imputation of missing values is superior to complete case analysis and the missing-indicator method in multivariable diagnostic research: A clinical example [J].
van der Heijden, Geert J. M. G. ;
Donders, A. Rogier T. ;
Stijnen, Theo ;
Moons, Karel G. M. .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 2006, 59 (10) :1102-1109