Improving the Prediction of Heart Failure Patients' Survival Using SMOTE and Effective Data Mining Techniques

Cited by: 164
Authors
Ishaq, Abid [1]
Sadiq, Saima [1]
Umer, Muhammad [1, 4]
Ullah, Saleem [1]
Mirjalili, Seyedali [2, 3, 5]
Rupapara, Vaibhav [6]
Nappi, Michele [7]
Affiliations
[1] Khwaja Fareed Univ Engn & Informat Technol, Dept Comp Sci, Rahim Yar Khan 64200, Pakistan
[2] Torrens Univ Australia, Ctr Artificial Intelligence Res & Optimizat, Brisbane, Qld 4006, Australia
[3] Yonsei Univ, Yonsei Frontier Lab, Seoul 03722, South Korea
[4] Islamia Univ Bahawalpur, Dept Comp Sci & Informat Technol, Bahawalpur 63100, Pakistan
[5] King Abdulaziz Univ, Jeddah 21589, Saudi Arabia
[6] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[7] Univ Salerno, Dept Comp Sci, I-84084 Fisciano, Italy
Keywords
Heart; Data mining; Predictive models; Machine learning algorithms; Support vector machines; Medical diagnostic imaging; Boosting; heart disease classification; machine learning; cardiovascular disease; feature selection; SMOTE
DOI
10.1109/ACCESS.2021.3064084
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cardiovascular disease is a major cause of mortality and morbidity worldwide. In clinical data analytics, predicting the survival of heart disease patients remains a great challenge. Data mining transforms the huge amounts of raw data generated by the health industry into useful information that can support informed decisions. Various studies have shown that significant features play a key role in improving the performance of machine learning models. This study analyzes heart failure survival using a dataset of 299 hospitalized patients. The aim is to find significant features and effective data mining techniques that can boost the accuracy of survival prediction for cardiovascular patients. To predict patient survival, this study employs nine classification models: Decision Tree (DT), Adaptive Boosting classifier (AdaBoost), Logistic Regression (LR), Stochastic Gradient Descent classifier (SGD), Random Forest (RF), Gradient Boosting classifier (GBM), Extra Tree Classifier (ETC), Gaussian Naive Bayes classifier (G-NB), and Support Vector Machine (SVM). The class imbalance problem is handled with the Synthetic Minority Oversampling Technique (SMOTE). Furthermore, the machine learning models are trained on the highest-ranked features selected by RF, and the results are compared with those obtained using the full set of features. Experimental results demonstrate that ETC outperforms the other models, achieving an accuracy of 0.9262 with SMOTE in predicting heart patient survival.
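The oversampling step named in the abstract can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. This is not the authors' code. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Basic SMOTE sketch: interpolate between minority samples and
    their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For each minority sample, find the indices of its k nearest
    # minority neighbours, excluding the sample itself.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # pick a minority sample
        j = rng.choice(neighbours[i])      # pick one of its neighbours
        gap = rng.random()                 # position along the segment, in [0, 1)
        x, z = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, z)])
    return synthetic


# Toy two-feature data, invented for illustration (not from the study).
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.2]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the evaluation data. Whether the study followed that protocol is not stated in the abstract.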
Pages: 39707-39716
Page count: 10
Related papers
50 in total
  • [1] Effective Prediction of Type II Diabetes Mellitus Using Data Mining Classifiers and SMOTE
    Shuja, Mirza
    Mittal, Sonu
    Zaman, Majid
    ADVANCES IN COMPUTING AND INTELLIGENT SYSTEMS, ICACM 2019, 2020, : 195 - 211
  • [2] Prediction of Heart Attacks using Data Mining Techniques
    Abdelghani, Bassam A.
    Fadal, Sophia
    Bedoor, Shadi
    Banitaan, Shadi
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 951 - 956
  • [3] HEART DISEASE PREDICTION USING DATA MINING TECHNIQUES
    Rairikar, Abhishek
    Kulkarni, Vedant
    Sabale, Vikas
    Kale, Harshavardhan
    Lamgunde, Anuradha
    PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL (I2C2), 2017,
  • [4] Current State of the Art for Survival Prediction in Cancer Using Data Mining Techniques
    Doja, M. N.
    Kaur, Ishleen
    Ahmad, Tanvir
    CURRENT BIOINFORMATICS, 2020, 15 (03) : 174 - 186
  • [5] Prediction of Heart Disease Using Classification Based Data Mining Techniques
    Joshi, Sujata
    Nair, Mydhili K.
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, VOL 2, 2015, 32 : 503 - 511
  • [6] Stock Market Prediction using Data Mining Techniques
    Maini, Sahaj Singh
    Govinda, K.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT SUSTAINABLE SYSTEMS (ICISS 2017), 2017, : 654 - 661
  • [7] Prediction of Effective Rainfall and Crop Water Needs using Data Mining Techniques
    Abishek, B.
    Eswar, Akash M.
    Priyatharshini, R.
    Deepika, P.
    2017 IEEE TECHNOLOGICAL INNOVATIONS IN ICT FOR AGRICULTURE AND RURAL DEVELOPMENT (TIAR), 2017, : 231 - 235
  • [8] Analysis of Data Mining Techniques for Heart Disease Prediction
    Sultana, Marjia
    Haider, Afrin
    Uddin, Mohammad Shorif
    2016 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION & COMMUNICATION TECHNOLOGY (ICEEICT), 2016,
  • [9] Crime Prediction on Open Data in India Using Data Mining Techniques
    Menaka, M.
    Sujatha, P.
    2024 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATION AND APPLIED INFORMATICS, ACCAI 2024, 2024,
  • [10] Intelligent Stock Data Prediction using Predictive Data Mining Techniques
    Kumar, Pankaj
    Bala, Anju
    2016 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT), VOL 3, 2015, : 743 - 747