Multiple Imputation with Predictive Mean Matching Method for Numerical Missing Data

Cited by: 10
Authors
Akmam, Emha Fathul [1]
Siswantining, Titin [1]
Soemartojo, Saskya Mary [1]
Sarwinda, Devvi [1]
Affiliations
[1] Univ Indonesia, Dept Math, Fac Math & Nat Sci, Depok, Indonesia
Source
2019 3RD INTERNATIONAL CONFERENCE ON INFORMATICS AND COMPUTATIONAL SCIENCES (ICICOS 2019) | 2019
Keywords
linear regression analysis; multiple imputation; missing values; predictive mean matching;
DOI
10.1109/icicos48119.2019.8982510
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Missing data occur when some values are missing or some entries are empty for several observations in a data set. They can hinder statistical analysis and may lead to biased conclusions if not handled properly. This problem arises in linear regression analysis. One way to handle it is a multiple imputation (MI) method called Predictive Mean Matching (PMM). PMM matches incomplete observations to complete observations by the distance between their predictive means. To follow the multiple imputation concept, the predictive means of the incomplete observations are estimated with a Bayesian approach, while those of the complete observations are estimated with ordinary least squares. The complete observation with the closest predictive mean then serves as the donor for the incomplete one. Simulated data with two variables (x and y), a univariate missing data pattern (on y), and a missing at random (MAR) mechanism are used to analyze the effectiveness of PMM based on the relative efficiency estimates for the missing covariate data. The regression analysis uses x as the independent variable and y as the dependent variable. The results show that PMM gives a regression coefficient that is significant at the 5% level of significance and loses only 1% of relative efficiency.
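The matching procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pmm_impute`, the use of `None` to mark missing y values, the `donors` parameter (1 reproduces the "closest distance" rule from the abstract), and the standard noninformative-prior draw of the regression parameters are all choices made here for illustration. It covers only the two-variable case with y missing and x fully observed.

```python
import random


def pmm_impute(x, y, m=5, donors=1, seed=0):
    """Return m completed copies of y (hypothetical helper).

    Missing entries of y are given as None; x must be fully observed.
    """
    rng = random.Random(seed)
    obs = [i for i, v in enumerate(y) if v is not None]
    mis = [i for i, v in enumerate(y) if v is None]
    n = len(obs)

    # OLS on the complete observations. x is centred on its observed mean,
    # which makes the level and slope estimates uncorrelated.
    x_bar = sum(x[i] for i in obs) / n
    y_bar = sum(y[i] for i in obs) / n
    sxx = sum((x[i] - x_bar) ** 2 for i in obs)
    slope = sum((x[i] - x_bar) * (y[i] - y_bar) for i in obs) / sxx
    pred_obs = [y_bar + slope * (x[i] - x_bar) for i in obs]
    sse = sum((y[i] - p) ** 2 for i, p in zip(obs, pred_obs))

    completed = []
    for _ in range(m):
        # Bayesian step (assumed noninformative prior): draw
        # sigma^2 = SSE / chi2(n - 2), then draw the level and slope
        # from their normal posteriors given sigma^2.
        sigma2 = sse / rng.gammavariate((n - 2) / 2, 2.0)
        level_star = rng.gauss(y_bar, (sigma2 / n) ** 0.5)
        slope_star = rng.gauss(slope, (sigma2 / sxx) ** 0.5)

        y_new = list(y)
        for i in mis:
            # Predictive mean of the incomplete case uses the drawn parameters.
            pred_i = level_star + slope_star * (x[i] - x_bar)
            # Rank complete cases by predictive-mean distance and copy the
            # observed y of one of the nearest `donors` cases.
            ranked = sorted(range(n), key=lambda j: abs(pred_obs[j] - pred_i))
            y_new[i] = y[obs[rng.choice(ranked[:donors])]]
        completed.append(y_new)
    return completed
```

Each of the m completed data sets would then be analyzed separately (here, a regression of y on x) and the results pooled. Because imputed values are copied from observed donors, they always stay within the range of the observed y values.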
Pages: 6
Related papers
50 in total
  • [31] Multiple imputation of binary multilevel missing not at random data
    Hammon, Angelina
    Zinn, Sabine
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2020, 69 (03) : 547 - 564
  • [32] Multiple Imputation of Missing Phenotype Data for QTL Mapping
    Bobb, Jennifer F.
    Scharfstein, Daniel O.
    Daniels, Michael J.
    Collins, Francis S.
    Kelada, Samir
    [J]. STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2011, 10 (01):
  • [33] Multiple Imputation Ensembles (MIE) for Dealing with Missing Data
    Aleryani A.
    Wang W.
    de la Iglesia B.
    [J]. SN Computer Science, 2020, 1 (3)
  • [34] A practical guide to multiple imputation of missing data in nephrology
    Blazek, Katrina
    van Zwieten, Anita
    Saglimbene, Valeria
    Teixeira-Pinto, Armando
    [J]. KIDNEY INTERNATIONAL, 2021, 99 (01) : 68 - 74
  • [35] Multiple imputation of censored survival data in the presence of missing covariates using restricted mean survival time
    Grover, Gurprit
    Gupta, Vinay K.
    [J]. JOURNAL OF APPLIED STATISTICS, 2015, 42 (04) : 817 - 827
  • [36] A new multiple imputation method for bounded missing values
    Kwon, Tae Yeon
    Park, Yousung
    [J]. STATISTICS & PROBABILITY LETTERS, 2015, 107 : 204 - 209
  • [37] Missing Values Imputation using Similarity Matching Method for Brainprint Authentication
    Liew, Siaw-Hong
    Choo, Yun-Huoy
    Low, Yin Fen
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (10) : 364 - 370
  • [38] A Modified Imputation Method to Missing Data as a Preprocessing Technique
    Caparino, Elenita T.
    Sison, Ariel M.
    Medina, Ruji P.
    [J]. 2018 IEEE 10TH INTERNATIONAL CONFERENCE ON HUMANOID, NANOTECHNOLOGY, INFORMATION TECHNOLOGY, COMMUNICATION AND CONTROL, ENVIRONMENT AND MANAGEMENT (HNICEM), 2018,
  • [39] A robust missing value imputation method for noisy data
    Zhu, Bing
    He, Changzheng
    Liatsis, Panos
    [J]. APPLIED INTELLIGENCE, 2012, 36 (01) : 61 - 74
  • [40] Analysing Mark–Recapture–Recovery Data in the Presence of Missing Covariate Data Via Multiple Imputation
    Hannah Worthington
    Ruth King
    Stephen T. Buckland
    [J]. Journal of Agricultural, Biological, and Environmental Statistics, 2015, 20 : 28 - 46