Performance analysis of supervised classification models on heart disease prediction

Cited by: 8
Authors
Ogundepo, Ezekiel Adebayo [1]
Yahya, Waheed Babatunde [1]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
Keywords
Classifiers; Model selection; Feature selection; Exploratory data analysis; Evaluation metrics
DOI
10.1007/s11334-022-00524-9
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
This paper presents a predictive analysis of data on heart disease patients to determine the possible risk factors associated with their heart disease status. Two independent (but similar) published heart disease datasets were analyzed: the Cleveland data, used to build the classification models, and the Statlog data, used to validate the results. A detailed exploratory analysis using the Chi-square test of independence was performed on the Cleveland data, after which ten standard classification models were trained for class prediction. The classification models were built by randomly partitioning the Cleveland data into 208 (70%) training samples and 89 (30%) test samples over 200 replications. Preliminary results showed that some of the bio-clinical categorical variables are strongly associated with the heart disease conditions of the patients (p < 0.001). The classification results from the test samples indicated that the support vector machine yielded the best predictive performance, with 85% accuracy, 82% sensitivity, 88% specificity, 87% precision, 91% area under the ROC curve, and a log loss value of 38%. These results were validated on the Statlog data using tenfold cross-validation, and the validation results were consistent with those obtained from the Cleveland dataset.
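The evaluation protocol described in the abstract (repeated random 70/30 splits, then accuracy, sensitivity, specificity, precision, area under the ROC curve, and log loss on the test samples) can be sketched roughly as follows. This is only an illustrative outline using the Python standard library, not the authors' implementation: the function names, the 0.5 decision threshold, the random seed, and the four toy labels and probabilities are all assumptions made for the example. Only the sample sizes (297 split into 208 and 89) and the 200 replications come from the abstract.

```python
import math
import random


def split_indices(n, train_size, rng):
    """Randomly partition range(n) into train and test index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    return idx[:train_size], idx[train_size:]


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Threshold-based metrics and log loss for binary labels (0/1).

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    eps = 1e-15  # clip probabilities so log() stays finite
    tp = fp = tn = fn = 0
    loss = 0.0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
        p = min(max(p, eps), 1 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return {
        "accuracy": _ratio(tp + tn, len(y_true)),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "precision": _ratio(tp, tp + fp),
        "log_loss": _ratio(loss, len(y_true)),
    }


def roc_auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return _ratio(wins, len(pos) * len(neg))


# 200 random partitions of 297 samples into 208 training and 89 test indices.
rng = random.Random(0)  # seed chosen arbitrarily for reproducibility
splits = [split_indices(297, 208, rng) for _ in range(200)]

# Toy labels and predicted probabilities (hypothetical, not study data).
toy_true = [1, 1, 0, 0]
toy_prob = [0.9, 0.4, 0.2, 0.6]
toy_metrics = binary_metrics(toy_true, toy_prob)
toy_auc = roc_auc(toy_true, toy_prob)
```

In a full replication of this kind of study, a classifier would be fitted on each training partition and the metrics averaged over the 200 test partitions; the sketch leaves the model-fitting step out because the abstract does not specify the software or settings used.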
Pages: 129-144
Number of pages: 16