Performance analysis of supervised classification models on heart disease prediction

Cited by: 8
Authors
Ogundepo, Ezekiel Adebayo [1]
Yahya, Waheed Babatunde [1]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
Keywords
Classifiers; Model selection; Feature selection; Exploratory data analysis; Evaluation metrics;
DOI
10.1007/s11334-022-00524-9
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
This paper presents a predictive analysis of heart disease patient data to identify risk factors associated with heart disease status. Two independent but similar published heart disease datasets were analysed: the Cleveland data, used to build the classification models, and the Statlog data, used to validate the results. A detailed exploratory analysis using the Chi-square test of independence was performed on the Cleveland data, after which ten standard classification models were trained for class prediction. The models were built by randomly partitioning the Cleveland data into 208 (70%) training samples and 89 (30%) test samples, repeated over 200 replications. Preliminary results showed that some of the bio-clinical categorical variables are strongly associated with the patients' heart disease condition (p < 0.001). On the test samples, the support vector machine gave the best predictive performance, with 85% accuracy, 82% sensitivity, 88% specificity, 87% precision, 91% area under the ROC curve, and a log loss of 0.38. Validation on the Statlog data using tenfold cross-validation gave results consistent with those obtained from the Cleveland dataset.
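As a rough illustration of the evaluation protocol summarised in the abstract, the sketch below shows a random 70/30 partition and the threshold-based metrics the abstract reports (accuracy, sensitivity, specificity, precision) together with log loss. It is not the authors' code: the function names `split_indices` and `binary_metrics`, the 0.5 decision threshold, and the probability clipping constant are all assumptions made for illustration, and the area under the ROC curve is left out for brevity.

```python
import math
import random


def split_indices(n_samples, train_fraction, rng):
    """Randomly partition range(n_samples) into train and test index lists."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute classification metrics from true labels (0/1) and predicted
    probabilities of the positive class.

    The threshold of 0.5 is an assumption, not taken from the paper.
    """
    eps = 1e-15  # clip probabilities so log(0) never occurs
    tp = tn = fp = fn = 0
    total_loss = 0.0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 0 and predicted == 0:
            tn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
        p = min(max(p, eps), 1 - eps)
        total_loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    n = tp + tn + fp + fn
    return {
        "accuracy": _ratio(tp + tn, n),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "precision": _ratio(tp, tp + fp),
        "log_loss": _ratio(total_loss, n),   # mean negative log-likelihood
    }
```

With 297 samples and a training fraction of 0.7, `split_indices` yields the 208/89 partition quoted in the abstract. Repeating the split with a fresh shuffle and averaging the resulting metrics would approximate the 200-replication design, under the assumption that each replication is an independent random partition.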
Pages: 129-144
Number of pages: 16