Performance analysis of supervised classification models on heart disease prediction

Cited by: 8
Authors
Ogundepo, Ezekiel Adebayo [1]
Yahya, Waheed Babatunde [1]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
Keywords
Classifiers; Model selection; Feature selection; Exploratory data analysis; Evaluation metrics;
DOI
10.1007/s11334-022-00524-9
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
This paper presents a predictive analysis of data on heart disease patients to determine the possible risk factors associated with their heart disease status. Two independent but similar published heart disease datasets were analyzed: the Cleveland data, used to build the classification models, and the Statlog data, used to validate the results. A detailed exploratory analysis using the Chi-square test of independence was performed on the Cleveland data, after which ten standard classification models were trained for class prediction. The models were built by randomly partitioning the Cleveland data into 208 (70%) training samples and 89 (30%) test samples over 200 replications. Preliminary results showed that some of the bio-clinical categorical variables are strongly associated with the patients' heart disease status (p < 0.001). On the test samples, the support vector machine yielded the best predictive performance, with 85% accuracy, 82% sensitivity, 88% specificity, 87% precision, 91% area under the ROC curve, and a 38% log loss value. These results were validated on the Statlog data using tenfold cross-validation, and the validation results were consistent with those obtained from the Cleveland dataset.
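The evaluation protocol described in the abstract (repeated random 208/89 partitions, with metrics computed on each test set) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_indices`, `binary_metrics`, and `mean_over_replications`, the 0.5 decision threshold, and the `fit_predict` callback are all assumptions made for illustration, and the area under the ROC curve is omitted for brevity.

```python
import math
import random


def split_indices(n, train_size, rng):
    """Randomly partition range(n) into train and test index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    return idx[:train_size], idx[train_size:]


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity, precision, and log loss for 0/1 labels.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    eps = 1e-15  # clip probabilities so log() never sees 0 or 1
    tp = fp = tn = fn = 0
    loss = 0.0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
        p = min(max(p, eps), 1 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    n = len(y_true)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "precision": tp / (tp + fp) if tp + fp else nan,
        "log_loss": loss / n,
    }


def mean_over_replications(y, fit_predict, n_rep=200, train_size=208, seed=0):
    """Average test-set metrics over repeated random train/test partitions.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains
    any classifier on the training rows and returns P(class = 1) for each
    test row, in test_idx order.
    """
    rng = random.Random(seed)
    totals = {}
    for _ in range(n_rep):
        train, test = split_indices(len(y), train_size, rng)
        probs = fit_predict(train, test)
        metrics = binary_metrics([y[i] for i in test], probs)
        for name, value in metrics.items():
            totals[name] = totals.get(name, 0.0) + value
    return {name: total / n_rep for name, total in totals.items()}
```

With 297 samples and `train_size=208`, each replication leaves 89 test samples, matching the 70%/30% partition reported in the abstract. Any classifier, for example a support vector machine with probability outputs, could be plugged in through `fit_predict`.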
Pages: 129-144
Page count: 16
Related papers
50 in total
  • [41] Analysis of Classification Algorithms for Breast Cancer Prediction
    Rajamohana, S. P.
    Umamaheswari, K.
    Karunya, K.
    Deepika, R.
    DATA MANAGEMENT, ANALYTICS AND INNOVATION, ICDMAI 2019, VOL 1, 2020, 1042 : 517 - 528
  • [42] Survey, classification and critical analysis of the literature on corporate bankruptcy and financial distress prediction
    Zhao, Jinxian
    Ouenniche, Jamal
    De Smedt, Johannes
    MACHINE LEARNING WITH APPLICATIONS, 2024, 15
  • [43] A diabetic disease prediction model based on classification algorithms
    Ahuja R.
    Sharma S.C.
    Ali M.
    Annals of Emerging Technologies in Computing, 2019, 3 (03) : 44 - 52
  • [44] Supervised machine learning based gait classification system for early detection and stage classification of Parkinson's disease
    Balaji, E.
    Brindha, D.
    Balakrishnan, R.
    APPLIED SOFT COMPUTING, 2020, 94
  • [45] Methods of performance evaluation for the supervised classification of satellite imagery in determining land cover classes
    Wachholz de Souza, Carlos H.
    Mercante, Erivelto
    Prudente, Victor H. R.
    Justina, Diego D. D.
    CIENCIA E INVESTIGACION AGRARIA, 2013, 40 (02) : 419 - 428
  • [46] Child Cry Classification - An Analysis of Features and Models
    Kulkarni, Prathamesh
    Umarani, Sarthak
    Diwan, Vaishnavi
    Korde, Vishakha
    Rege, Priti P.
    2021 6TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2021
  • [47] Assessment of performance of survival prediction models for cancer prognosis
    Chen, Hung-Chia
    Kodell, Ralph L.
    Cheng, Kuang Fu
    Chen, James J.
    BMC MEDICAL RESEARCH METHODOLOGY, 2012, 12
  • [48] HSLE: A Hybrid Ensemble Classifier for Prediction of Heart Disease
    Kushwaha, Pradeep Kumar
    Dagur, Arvind
    Shukla, Dhirendra
    RECENT ADVANCES IN ELECTRICAL & ELECTRONIC ENGINEERING, 2024
  • [49] Impacts of crisis on SME bankruptcy prediction models' performance
    Papik, Mario
    Papikova, Lenka
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 214
  • [50] Lung Cancer Classification Using Genetic Algorithm to Optimize Prediction Models
    Diaz, Joey Mark
    Pinon, Raymond Christopher
    Solano, Geoffrey
    5TH INTERNATIONAL CONFERENCE ON INFORMATION, INTELLIGENCE, SYSTEMS AND APPLICATIONS, IISA 2014, 2014, : 151 - +