Multiple Imputation with Predictive Mean Matching Method for Numerical Missing Data

Cited by: 10
Authors
Akmam, Emha Fathul [1]
Siswantining, Titin [1]
Soemartojo, Saskya Mary [1]
Sarwinda, Devvi [1]
Affiliation
[1] Univ Indonesia, Dept Math, Fac Math & Nat Sci, Depok, Indonesia
Source
2019 3RD INTERNATIONAL CONFERENCE ON INFORMATICS AND COMPUTATIONAL SCIENCES (ICICOS 2019) | 2019
Keywords
linear regression analysis; multiple imputation; missing values; predictive mean matching;
DOI
10.1109/icicos48119.2019.8982510
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Missing data are a condition in which some values or entries are missing from several observations in a data set. They can hinder statistical analysis and may lead to biased conclusions if not handled properly. This problem can arise in linear regression analysis. One way to handle it is a multiple imputation (MI) method called Predictive Mean Matching (PMM). PMM matches incomplete observations to complete observations by the distance between their predictive means. To follow the multiple imputation concept, the predictive means of the incomplete observations were estimated with a Bayesian approach, while those of the complete observations were estimated with ordinary least squares. The complete observation with the closest distance then serves as the donor value for the incomplete one. Simulated data with two variables (x and y), a univariate missing data pattern (on y), and a missing at random (MAR) mechanism were used to analyze the effectiveness of PMM based on the relative efficiency estimation result of missing covariate data. The regression analysis used x as the independent variable and y as the dependent variable. The results showed that PMM gave a significant regression coefficient at the 5% level of significance and lost only 1% of relative efficiency.
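The matching procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `pmm_impute`, the parameters `n_imputations` and `n_donors`, the noninformative prior used for the Bayesian draw, and the simulated data (including the logistic MAR rule) are all assumptions made for the example.

```python
import numpy as np

def pmm_impute(x, y, n_imputations=5, n_donors=1, seed=0):
    """Hypothetical PMM sketch: return a list of completed copies of y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
    X_obs, y_obs, X_mis = X[~missing], y[~missing], X[missing]

    # Ordinary least squares on the complete observations.
    xtx_inv = np.linalg.inv(X_obs.T @ X_obs)
    beta_hat = xtx_inv @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat
    dof = len(y_obs) - X.shape[1]
    pred_obs = X_obs @ beta_hat  # predictive means of complete cases (OLS)

    completed = []
    for _ in range(n_imputations):
        # Bayesian draw of (sigma^2, beta), assuming a noninformative prior.
        sigma2 = resid @ resid / rng.chisquare(dof)
        beta_star = rng.multivariate_normal(beta_hat, sigma2 * xtx_inv)
        pred_mis = X_mis @ beta_star  # predictive means of incomplete cases

        # Distance between every incomplete and every complete predictive mean.
        dist = np.abs(pred_mis[:, None] - pred_obs[None, :])
        nearest = np.argsort(dist, axis=1)[:, :n_donors]
        # Pick one donor at random among the n_donors closest complete cases.
        pick = rng.integers(0, n_donors, size=len(pred_mis))
        donors = nearest[np.arange(len(pred_mis)), pick]

        y_imp = y.copy()
        y_imp[missing] = y_obs[donors]  # donor's observed value fills the gap
        completed.append(y_imp)
    return completed

# Simulated example (assumed setup): y depends linearly on x, and the chance
# that y is missing depends only on the observed x, i.e. a MAR mechanism.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(size=200)
p_miss = 1.0 / (1.0 + np.exp(-(x - 1.0)))
y_missing = np.where(rng.random(200) < p_miss, np.nan, y)
imputed = pmm_impute(x, y_missing)
```

With `n_donors=1` each incomplete case receives the value of its single closest complete case, as the abstract describes. Because every imputed value is an observed value of y, the imputations always stay within the range of the real data. Each completed data set would then be analyzed separately and the regression estimates pooled.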
Pages: 6
Related papers
50 in total
  • [21] A new iterative fuzzy clustering algorithm for multiple imputation of missing data
    Nikfalazar, Sanaz
    Yeh, Chung-Hsing
    Bedingfield, Susan
    Khorshidi, Hadi A.
    2017 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2017,
  • [22] From Predictive Methods to Missing Data Imputation: An Optimization Approach
    Bertsimas, Dimitris
    Pawlowski, Colin
    Zhuo, Ying Daisy
    JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 18
  • [23] Multiple Imputation of Missing Data in Educational Production Functions
    Elasra, Amira
    COMPUTATION, 2022, 10 (04)
  • [24] Multiple Imputation for Missing Data in Life Cycle Inventory
    Liu, Yu
    Gong, Xianzheng
    Wang, ZhiHong
    Liu, Wei
    Nie, Zuoren
    MATERIALS RESEARCH, PTS 1 AND 2, 2009, 610-613 : 21 - 27
  • [25] A nonparametric multiple imputation approach for missing categorical data
    Zhou, Muhan
    He, Yulei
    Yu, Mandi
    Hsu, Chiu-Hsieh
    BMC MEDICAL RESEARCH METHODOLOGY, 2017, 17
  • [26] Missing data and multiple imputation in clinical epidemiological research
    Pedersen, Alma B.
    Mikkelsen, Ellen M.
    Cronin-Fenton, Deirdre
    Kristensen, Nickolaj R.
    Tra My Pham
    Pedersen, Lars
    Petersen, Irene
    CLINICAL EPIDEMIOLOGY, 2017, 9 : 157 - 165
  • [27] Multiple imputation of unordered categorical missing data: A comparison of the multivariate normal imputation and multiple imputation by chained equations
    Karangwa, Innocent
    Kotze, Danelle
    Blignaut, Renette
    BRAZILIAN JOURNAL OF PROBABILITY AND STATISTICS, 2016, 30 (04) : 521 - 539
  • [28] Multiple Imputation of Missing Composite Outcomes in Longitudinal Data
    O’Keeffe A.G.
    Farewell D.M.
    Tom B.D.M.
    Farewell V.T.
    Statistics in Biosciences, 2016, 8 (2) : 310 - 332
  • [29] A nonparametric multiple imputation approach for missing categorical data
    Muhan Zhou
    Yulei He
    Mandi Yu
    Chiu-Hsieh Hsu
    BMC Medical Research Methodology, 17
  • [30] A practical guide to multiple imputation of missing data in nephrology
    Blazek, Katrina
    van Zwieten, Anita
    Saglimbene, Valeria
    Teixeira-Pinto, Armando
    KIDNEY INTERNATIONAL, 2021, 99 (01) : 68 - 74