Improving the Prediction of Heart Failure Patients' Survival Using SMOTE and Effective Data Mining Techniques

Cited by: 164
Authors
Ishaq, Abid [1]
Sadiq, Saima [1]
Umer, Muhammad [1, 4]
Ullah, Saleem [1]
Mirjalili, Seyedali [2, 3, 5]
Rupapara, Vaibhav [6]
Nappi, Michele [7]
Affiliations
[1] Khwaja Fareed Univ Engn & Informat Technol, Dept Comp Sci, Rahim Yar Khan 64200, Pakistan
[2] Torrens Univ Australia, Ctr Artificial Intelligence Res & Optimizat, Brisbane, Qld 4006, Australia
[3] Yonsei Univ, Yonsei Frontier Lab, Seoul 03722, South Korea
[4] Islamia Univ Bahawalpur, Dept Comp Sci & Informat Technol, Bahawalpur 63100, Pakistan
[5] King Abdulaziz Univ, Jeddah 21589, Saudi Arabia
[6] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[7] Univ Salerno, Dept Comp Sci, I-84084 Fisciano, Italy
Keywords
Heart; Data mining; Predictive models; Machine learning algorithms; Support vector machines; Medical diagnostic imaging; Boosting; heart disease classification; machine learning; cardiovascular disease; feature selection; SMOTE;
DOI
10.1109/ACCESS.2021.3064084
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cardiovascular disease is a major cause of mortality and morbidity worldwide. In clinical data analytics, predicting the survival of heart disease patients is a major challenge. Data mining transforms the large amounts of raw data generated by the health industry into useful information that supports informed decisions. Various studies have shown that significant features play a key role in improving the performance of machine learning models. This study analyzes heart failure survival using a dataset of 299 hospitalized patients. The aim is to find significant features and effective data mining techniques that can boost the accuracy of survival prediction for cardiovascular patients. To predict patient survival, this study employs nine classification models: Decision Tree (DT), Adaptive Boosting classifier (AdaBoost), Logistic Regression (LR), Stochastic Gradient classifier (SGD), Random Forest (RF), Gradient Boosting classifier (GBM), Extra Tree Classifier (ETC), Gaussian Naive Bayes classifier (G-NB) and Support Vector Machine (SVM). The class imbalance problem is handled with the Synthetic Minority Oversampling Technique (SMOTE). Furthermore, the machine learning models are trained on the highest-ranked features selected by RF. The results are compared with those obtained by the same algorithms using the full set of features. Experimental results demonstrate that ETC outperforms the other models, achieving an accuracy of 0.9262 with SMOTE in predicting heart patients' survival.
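The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the abstract does not give their code or parameters); it only shows the core SMOTE idea of creating synthetic minority-class rows by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, the parameter `k`, the seed, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic rows by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data (not from the paper): 4 minority rows, 8 majority rows.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
majority_count = 8
new_rows = smote(minority, majority_count - len(minority), k=3)
balanced_minority = minority + new_rows
print(len(balanced_minority))
```

In practice a maintained library implementation would normally be preferred over a hand-rolled one, and oversampling is usually applied to the training split only so that synthetic rows do not leak into evaluation.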
Pages: 39707-39716
Page count: 10