A comparison of imputation techniques for handling missing data

Cited by: 175
Authors
Musil, CM [1]
Warner, CB
Yobas, PK
Jones, SL
Affiliations
[1] Case Western Reserve Univ, Frances Payne Bolton Sch Nursing, Dept Sociol, Cleveland, OH 44106 USA
[2] Mahidol Univ, Fac Nursing, Dept Mental Hlth & Psychiat Nursing, Bangkok 10700, Thailand
[3] Kent State Univ, Coll Nursing, Kent, OH 44242 USA
Funding
National Institutes of Health (USA)
DOI
10.1177/019394502762477004
Chinese Library Classification number
R47 [Nursing]
Discipline classification code
1011
Abstract
Researchers are commonly faced with the problem of missing data. This article presents theoretical and empirical information for the selection and application of approaches for handling missing data on a single variable. An actual data set of 492 cases with no missing values was used to create a simulated yet realistic data set with missing at random (MAR) data. The authors compare and contrast five approaches (listwise deletion, mean substitution, simple regression, regression with an error term, and the expectation maximization [EM] algorithm) for dealing with missing data, and compare the effects of each method on descriptive statistics and correlation coefficients for the imputed data (n = 96) and the entire sample (n = 492) when imputed data are included. All methods had limitations, although our findings suggest that mean substitution was the least effective and that regression with an error term and the EM algorithm produced estimates closest to those of the original variables.
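The kind of comparison described in the abstract can be illustrated with a small sketch. This is not the authors' procedure or data: the variables, coefficients, and the rule used to delete values are all hypothetical, chosen only so that 96 of 492 values on one variable are missing in a way that depends on an observed covariate (MAR). It covers listwise deletion, mean substitution, simple regression, and regression with an error term; the EM algorithm is left out to keep the example short.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def impute(xs, ys, method, rng):
    """Return a copy of ys with None entries filled in by the chosen method."""
    if method not in ("mean", "regression", "regression_error"):
        raise ValueError(f"unknown method: {method}")
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    obs_x = [x for x, _ in obs]
    obs_y = [y for _, y in obs]
    if method == "mean":
        mean_y = statistics.fmean(obs_y)
        return [mean_y if y is None else y for y in ys]
    intercept, slope = fit_line(obs_x, obs_y)
    # Residual SD among observed cases; only the stochastic variant uses it.
    resid = [y - (intercept + slope * x) for x, y in obs]
    resid_sd = (sum(r * r for r in resid) / (len(resid) - 2)) ** 0.5
    filled = []
    for x, y in zip(xs, ys):
        if y is not None:
            filled.append(y)
        elif method == "regression":
            filled.append(intercept + slope * x)
        else:  # regression with a random error term added to the prediction
            filled.append(intercept + slope * x + rng.gauss(0.0, resid_sd))
    return filled


# Hypothetical complete data: 492 cases, covariate x and target y.
rng = random.Random(0)
n, n_missing = 492, 96
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_true = [2.0 + 0.6 * x + rng.gauss(0.0, 0.8) for x in xs]

# Assumed MAR mechanism: deletion depends on x plus noise, never on y itself.
scores = [x + rng.gauss(0.0, 1.0) for x in xs]
drop = set(sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_missing])
y_missing = [None if i in drop else y for i, y in enumerate(y_true)]

# Listwise deletion keeps only the cases where y was observed.
listwise = [y for y in y_missing if y is not None]
listwise_x = [x for x, y in zip(xs, y_missing) if y is not None]

methods = ("mean", "regression", "regression_error")
completed = {m: impute(xs, y_missing, m, rng) for m in methods}

print("original ", statistics.stdev(y_true), pearson(xs, y_true))
print("listwise ", statistics.stdev(listwise), pearson(listwise_x, listwise))
for name, ys in completed.items():
    print(name, statistics.stdev(ys), pearson(xs, ys))
```

Each printed row gives the standard deviation of the target and its correlation with the covariate. Mean substitution necessarily shrinks the standard deviation relative to the observed cases, because every imputed value sits exactly at the observed mean; adding a random residual to the regression prediction is one way to restore some of that lost variability.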
Pages: 815-829
Page count: 15