Improving the Prediction of Heart Failure Patients' Survival Using SMOTE and Effective Data Mining Techniques

Cited by: 164
Authors
Ishaq, Abid [1]
Sadiq, Saima [1]
Umer, Muhammad [1, 4]
Ullah, Saleem [1]
Mirjalili, Seyedali [2, 3, 5]
Rupapara, Vaibhav [6]
Nappi, Michele [7]
Affiliations
[1] Khwaja Fareed Univ Engn & Informat Technol, Dept Comp Sci, Rahim Yar Khan 64200, Pakistan
[2] Torrens Univ Australia, Ctr Artificial Intelligence Res & Optimizat, Brisbane, Qld 4006, Australia
[3] Yonsei Univ, Yonsei Frontier Lab, Seoul 03722, South Korea
[4] Islamia Univ Bahawalpur, Dept Comp Sci & Informat Technol, Bahawalpur 63100, Pakistan
[5] King Abdulaziz Univ, Jeddah 21589, Saudi Arabia
[6] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[7] Univ Salerno, Dept Comp Sci, I-84084 Fisciano, Italy
Keywords
Heart; Data mining; Predictive models; Machine learning algorithms; Support vector machines; Medical diagnostic imaging; Boosting; heart disease classification; machine learning; cardiovascular disease; feature selection; SMOTE
DOI
10.1109/ACCESS.2021.3064084
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cardiovascular disease is a substantial cause of mortality and morbidity worldwide. In clinical data analytics, predicting the survival of heart disease patients is a great challenge. Data mining transforms the huge amounts of raw data generated by the health industry into useful information that can support informed decisions. Various studies have shown that significant features play a key role in improving the performance of machine learning models. This study analyzes heart failure survival in a dataset of 299 hospitalized patients. The aim is to find significant features and effective data mining techniques that can boost the accuracy of survival prediction for cardiovascular patients. To predict patient survival, this study employs nine classification models: Decision Tree (DT), Adaptive Boosting classifier (AdaBoost), Logistic Regression (LR), Stochastic Gradient Descent classifier (SGD), Random Forest (RF), Gradient Boosting classifier (GBM), Extra Tree Classifier (ETC), Gaussian Naive Bayes classifier (G-NB), and Support Vector Machine (SVM). The class imbalance problem is handled with the Synthetic Minority Oversampling Technique (SMOTE). Furthermore, the machine learning models are trained on the highest-ranked features selected by RF, and the results are compared with those obtained using the full set of features. Experimental results demonstrate that ETC outperforms the other models, achieving an accuracy of 0.9262 with SMOTE in predicting heart patients' survival.
Pages: 39707-39716
Number of pages: 10
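
The abstract states that class imbalance is handled with SMOTE before the classifiers are trained. As a rough illustration of that oversampling idea only, and not the authors' code, the following is a minimal pure-Python sketch: each synthetic point is placed at a random position on the segment between a minority-class sample and one of its k nearest minority-class neighbours. The function names `smote` and `balance`, the parameter defaults, and the toy two-feature data are all assumptions made for this example; none of them come from the paper or its dataset.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch; `minority` is a list of equal-length numeric lists.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until it matches the rest in size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = (len(y) - len(minority)) - len(minority)
    if n_new <= 0:
        return list(X), list(y)
    new_points = smote(minority, n_new, k=k, seed=seed)
    return list(X) + new_points, list(y) + [minority_label] * n_new


# Hypothetical toy data: 20 majority points and 6 minority points.
_rng = random.Random(42)
X = [[_rng.gauss(0, 1), _rng.gauss(0, 1)] for _ in range(20)]
X += [[_rng.gauss(3, 1), _rng.gauss(3, 1)] for _ in range(6)]
y = [0] * 20 + [1] * 6

X_bal, y_bal = balance(X, y)
print(y_bal.count(0), y_bal.count(1))  # prints: 20 20
```

In a pipeline like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that synthetic points do not leak into the data used to report accuracy.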