Performance analysis of supervised classification models on heart disease prediction

Cited by: 0
Authors
Ezekiel Adebayo Ogundepo
Waheed Babatunde Yahya
Affiliation
[1] Department of Statistics, University of Ilorin
Source
Innovations in Systems and Software Engineering | 2023 / Vol. 19
Keywords
Classifiers; Model selection; Feature selection; Exploratory data analysis; Evaluation metrics
DOI
Not available
Abstract
This paper presents a predictive analysis of data on heart disease patients to identify risk factors associated with heart disease status. Two independent but similar published heart disease datasets were analyzed: the Cleveland data, used to build the classification models, and the Statlog data, used to validate the results. A detailed exploratory analysis using the Chi-square test of independence was performed on the Cleveland data, after which ten standard classification models were trained for class prediction. The models were built by randomly partitioning the Cleveland data into 208 (70%) training samples and 89 (30%) test samples over 200 replications. Preliminary results showed that some of the bio-clinical categorical variables are strongly associated with the patients' heart disease status (p < 0.001). On the test samples, the support vector machine yielded the best predictive performance, with 85% accuracy, 82% sensitivity, 88% specificity, 87% precision, 91% area under the ROC curve, and a log loss of 38%. These results were validated on the Statlog data using tenfold cross-validation, and the validation results were all consistent with those obtained from the Cleveland dataset.
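The evaluation protocol in the abstract (a random 70/30 partition, followed by accuracy, sensitivity, specificity, precision, and log loss on the test samples) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `random_split` and `binary_metrics`, the 0.5 decision threshold, and the probability clipping constant are assumptions made for illustration, and no classifier is trained here.

```python
import math
import random


def random_split(n_samples, train_fraction, rng):
    """Return (train_idx, test_idx) for one random partition of range(n_samples)."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_prob, threshold=0.5, eps=1e-15):
    """Compute test-set metrics from 0/1 labels and predicted P(class = 1).

    The threshold and eps defaults are illustrative assumptions.
    """
    tp = fp = tn = fn = 0
    loss = 0.0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
        # Clip probabilities so the logarithm is always defined.
        p = min(max(p, eps), 1 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
        "log_loss": _ratio(loss, len(y_true)),
    }


# One replication of a 70/30 split on 297 samples (the Cleveland sample size
# implied by 208 + 89); repeating this 200 times mirrors the stated design.
rng = random.Random(0)
train_idx, test_idx = random_split(297, 0.7, rng)
print(len(train_idx), len(test_idx))  # 208 89
```

In a full replication study, a model would be fitted on `train_idx` in each of the 200 replications, `binary_metrics` would be applied to its predicted probabilities on `test_idx`, and the resulting metrics would be averaged across replications.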
Pages: 129-144
Page count: 15