Performance Analysis of Various Missing Value Imputation Methods on Heart Failure Dataset

Cited by: 2
Authors
Al Khaldy, Mohammad [1]
Kambhampati, Chandrasekhar [1]
Affiliations
[1] Univ Hull, Dept Comp Sci, Kingston Upon Hull, N Humberside, England
Source
PROCEEDINGS OF SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) 2016, VOL 2 | 2018 / Vol. 16
Keywords
Heart failure; Decision tree; J48; REPTree; Random forest; EM; Most common; CMC; KNN; K-mean; SVM; SOFTWARE TOOL; CLASSIFICATION; ALGORITHMS; KEEL;
DOI
10.1007/978-3-319-56991-8_31
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Missing data is a fundamental challenge in data analysis and classification. Classification performance on incomplete data can be affected, yielding accuracy results that differ from those obtained on complete data. In this work, we compare six scalable imputation methods applied to a heart failure dataset. The comparison uses the performance metrics of three classification methods: J48, REPTree, and Random Forest. The aim of the research is to find the classifier that achieves the best performance after the missing data have been imputed with the different imputation methods. The results show that, in general, Random Forest achieves the best results in comparison with the decision trees J48 and REPTree. Furthermore, classification performance improved when the missing values were imputed by concept most common (CMC) and support vector machine (SVM).
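To make the difference between plain most-common imputation and concept most common (CMC) concrete, the sketch below fills gaps in a tiny nominal dataset both ways. It assumes that CMC replaces a missing value with the most frequent value among records of the same class, which is how the method is usually described; the record does not show the authors' implementation, and the function names, the `MISSING` marker, and the toy rows are all hypothetical.

```python
from collections import Counter

MISSING = None  # hypothetical marker for a missing value


def most_common(values):
    """Return the most frequent non-missing value, or MISSING if none exists."""
    counts = Counter(v for v in values if v is not MISSING)
    return counts.most_common(1)[0][0] if counts else MISSING


def impute_most_common(rows, labels):
    """Fill each gap with the attribute's most frequent value over all rows.

    `labels` is unused; it is accepted only so both imputers share a signature.
    """
    n_attrs = len(rows[0])
    fill = [most_common(r[j] for r in rows) for j in range(n_attrs)]
    return [[fill[j] if v is MISSING else v for j, v in enumerate(r)] for r in rows]


def impute_concept_most_common(rows, labels):
    """Fill each gap with the most frequent value among rows of the same class."""
    n_attrs = len(rows[0])
    fill = {}
    for c in set(labels):
        same = [r for r, y in zip(rows, labels) if y == c]
        fill[c] = [most_common(r[j] for r in same) for j in range(n_attrs)]
    return [
        [fill[y][j] if v is MISSING else v for j, v in enumerate(r)]
        for r, y in zip(rows, labels)
    ]


# Made-up toy records (two nominal attributes), not taken from the paper's data.
rows = [
    ["yes", "high"],
    ["yes", MISSING],
    ["yes", "high"],
    ["no", "low"],
    [MISSING, "low"],
    ["no", "low"],
    ["no", "low"],
    ["yes", "low"],
]
labels = ["died", "died", "died",
          "survived", "survived", "survived", "survived", "survived"]

mc_rows = impute_most_common(rows, labels)
cmc_rows = impute_concept_most_common(rows, labels)
```

On these toy rows the two strategies disagree on both gaps: the global mode ignores the class label, while the class-conditional mode follows it. Either imputed table could then be passed to any classifier for the kind of comparison the abstract describes.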
Pages: 415-425
Page count: 11