Principled missing data methods for researchers

Cited by: 1416
Authors
Dong, Yiran [1]
Peng, Chao-Ying Joanne [1]
Affiliations
[1] Indiana Univ, Bloomington, IN 47405 USA
Source
SPRINGERPLUS | 2013 / Vol. 2
Keywords
Missing data; Listwise deletion; MI; FIML; EM; MAR; MCAR; MNAR; MULTIPLE IMPUTATION; MAXIMUM-LIKELIHOOD; CHAINED EQUATIONS; PERFORMANCE; SOFTWARE; UPDATE; VALUES; STATE
DOI
10.1186/2193-1801-2-222
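Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code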
07 ; 0710 ; 09 ;
Abstract
The impact of missing data on quantitative research can be serious, leading to biased estimates of parameters, loss of information, decreased statistical power, increased standard errors, and weakened generalizability of findings. In this paper, we discussed and demonstrated three principled missing data methods: multiple imputation, full information maximum likelihood, and expectation-maximization algorithm, applied to a real-world data set. Results were contrasted with those obtained from the complete data set and from the listwise deletion method. The relative merits of each method are noted, along with common features they share. The paper concludes with an emphasis on the importance of statistical assumptions, and recommendations for researchers. Quality of research will be enhanced if (a) researchers explicitly acknowledge missing data problems and the conditions under which they occurred, (b) principled methods are employed to handle missing data, and (c) the appropriate treatment of missing data is incorporated into review standards of manuscripts submitted for publication.
Pages: 1 - 17
Number of pages: 17
Related papers
50 in total
  • [31] Why Missing Data Matter in the Longitudinal Study of Adolescent Development: Using the 4-H Study to Understand the Uses of Different Missing Data Methods
    Helena Jeličić
    Erin Phelps
    Richard M. Lerner
    Journal of Youth and Adolescence, 2010, 39 : 816 - 835
  • [32] Why Missing Data Matter in the Longitudinal Study of Adolescent Development: Using the 4-H Study to Understand the Uses of Different Missing Data Methods
    Jelicic, Helena
    Phelps, Erin
    Lerner, Richard M.
    JOURNAL OF YOUTH AND ADOLESCENCE, 2010, 39 (07) : 816 - 835
  • [33] Multivariate missing data in hydrology - Review and applications
    Ben Aissia, Mohamed-Aymen
    Chebana, Fateh
    Ouarda, Taha B. M. J.
    ADVANCES IN WATER RESOURCES, 2017, 110 : 299 - 309
  • [34] How to generate missing data for simulation studies
    Zhang, Xijuan
    QUANTITATIVE METHODS FOR PSYCHOLOGY, 2023, 19 (02) : 100 - 122
  • [35] Estimating Incremental Validity Under Missing Data
    Fife, Dustin A.
    Mendoza, Jorge L.
    Berry, Christopher M.
    MULTIVARIATE BEHAVIORAL RESEARCH, 2017, 52 (02) : 164 - 177
  • [36] Missing Data Approaches in eHealth Research: Simulation Study and a Tutorial for Nonmathematically Inclined Researchers
    Blankers, Matthijs
    Koeter, Maarten W. J.
    Schippers, Gerard M.
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2010, 12 (05) : e54p.1 - e54p.11
  • [37] Missing data in principal component analysis of questionnaire data: a comparison of methods
    Van Ginkel, Joost R.
    Kroonenberg, Pieter M.
    Kiers, Henk A. L.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2014, 84 (11) : 2298 - 2315
  • [38] Missing Data in Alcohol Clinical Trials: A Comparison of Methods
    Hallgren, Kevin A.
    Witkiewitz, Katie
    ALCOHOLISM-CLINICAL AND EXPERIMENTAL RESEARCH, 2013, 37 (12) : 2152 - 2160
  • [39] Missing Network Data A Comparison of Different Imputation Methods
    Krause, Robert W.
    Huisman, Mark
    Steglich, Christian
    Snijders, Tom A. B.
    2018 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2018 : 159 - 163
  • [40] Evaluating Imputation Methods for Missing Data in a MCI Dataset
    Gomez-Valades Batanero, Alba
    Rincon Zamorano, Mariano
    Martinez Tomas, Rafael
    Guerrero Martin, Juan
    ARTIFICIAL INTELLIGENCE IN NEUROSCIENCE: AFFECTIVE ANALYSIS AND HEALTH APPLICATIONS, PT I, 2022, 13258 : 446 - 454