Generating missing values for simulation purposes: a multivariate amputation procedure

Cited by: 64
Authors
Schouten, Rianne Margaretha [1]
Lugtig, Peter [1]
Vink, Gerko [1]
Affiliations
[1] Univ Utrecht, Dept Methodol & Stat, Sjoerd Groenman Bldg,Padualaan 14, NL-3584 CH Utrecht, Netherlands
Keywords
Missing data; multiple imputation; multivariate amputation; evaluation; data designs; models
DOI
10.1080/00949655.2018.1491577
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Missing data form a ubiquitous problem in scientific research, especially since most statistical analyses require complete data. To evaluate the performance of methods dealing with missing data, researchers perform simulation studies. An important aspect of these studies is the generation of missing values in a simulated, complete data set: the amputation procedure. We investigated the methodological validity and statistical nature of both the current amputation practice and a newly developed and implemented multivariate amputation procedure. We found that the current way of practice may not be appropriate for the generation of intuitive and reliable missing data problems. The multivariate amputation procedure, on the other hand, generates reliable amputations and allows for a proper regulation of missing data problems. The procedure has additional features to generate any missing data scenario precisely as intended. Hence, the multivariate amputation procedure is an efficient method to accurately evaluate missing data methodology.
Pages: 2909-2930
Number of pages: 22