Prediction of heart disease and classifiers' sensitivity analysis

Cited by: 61
Author
Almustafa, Khaled Mohamad [1]
Affiliation
[1] Prince Sultan Univ, Coll Comp & Informat Syst, Dept Informat Syst, Riyadh, Saudi Arabia
Keywords
Heart disease (HD); Prediction; Classification; K-nearest neighbor; Support vector machine (SVM); Decision tree J48; Feature selection; Sensitivity analysis
DOI
10.1186/s12859-020-03626-y
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Heart disease (HD) is one of the most common diseases today, and early diagnosis is crucial for health care providers seeking to protect their patients and save lives. In this paper, a comparative analysis of different classifiers was performed on the Heart Disease dataset in order to correctly classify and/or predict HD cases with minimal attributes. The dataset contains 76 attributes, including the class attribute, for 1025 patients collected from Cleveland, Hungary, Switzerland, and Long Beach; in this paper, only a subset of 14 attributes is used, and each attribute has a given set value. The algorithms used were the K-Nearest Neighbor (K-NN), Naive Bayes, Decision tree J48, JRip, SVM, Adaboost, Stochastic Gradient Descent (SGD) and Decision Table (DT) classifiers, compared to show which of the selected classification algorithms best classify and/or predict the HD cases.

Results: Using different classification algorithms on the HD dataset gave very promising results in terms of classification accuracy for the K-NN (K=1), Decision tree J48 and JRip classifiers, with accuracies of 99.7073%, 98.0488% and 97.2683%, respectively. Feature selection was performed on the HD dataset using the Classifier Subset Evaluator, and the results show enhanced classification accuracy for the K-NN (K=1) and Decision Table classifiers, reaching 100% and 93.8537%, respectively, when a combination of up to 4 selected attributes is used instead of 13 attributes for the prediction of HD cases.

Conclusion: Different classifiers were used and compared to classify the HD dataset, and we conclude that a reliable feature selection method benefits HD prediction by using a minimal number of attributes instead of having to consider all available ones.
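The idea of scoring attribute subsets by the accuracy of the classifier itself can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not name the toolkit, search strategy, or evaluation protocol, so leave-one-out accuracy, exhaustive search, unscaled squared Euclidean distance, and the function names `one_nn_predict`, `loo_accuracy` and `best_subset` are all assumptions made here for illustration. The rows in `TOY` are synthetic and are not taken from the Heart Disease dataset.

```python
from itertools import combinations


def one_nn_predict(train, query, features):
    """Return the label of the training row closest to `query` on `features`."""
    best_label, best_dist = None, float("inf")
    for row, label in train:
        # Squared Euclidean distance restricted to the chosen attributes.
        dist = sum((row[f] - query[f]) ** 2 for f in features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(data, features):
    """Leave-one-out accuracy of 1-NN (K=1) restricted to `features`."""
    hits = 0
    for i, (row, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out row i
        hits += one_nn_predict(train, row, features) == label
    return hits / len(data)


def best_subset(data, n_features, max_size=4):
    """Score every subset of up to `max_size` attributes; keep the first best."""
    best = ((), 0.0)
    for size in range(1, max_size + 1):
        for subset in combinations(range(n_features), size):
            acc = loo_accuracy(data, subset)
            if acc > best[1]:  # strict: smaller subsets win ties
                best = (subset, acc)
    return best


# Synthetic toy rows (assumption, NOT the Heart Disease data):
# attribute 0 separates the classes, attributes 1 and 2 are noise.
TOY = [
    ((0.1, 5.0, 3.0), 0),
    ((0.2, 1.0, 9.0), 0),
    ((0.3, 8.0, 1.0), 0),
    ((0.9, 5.1, 3.1), 1),
    ((1.0, 1.1, 9.1), 1),
    ((1.1, 8.1, 1.1), 1),
]

subset, acc = best_subset(TOY, n_features=3, max_size=2)
all_acc = loo_accuracy(TOY, (0, 1, 2))
print("selected attributes:", subset, "accuracy:", acc)
print("accuracy with all attributes:", all_acc)
```

On this toy data the noisy attributes dominate the distance when all three are used, so the 1-NN classifier does better on the selected subset than on the full attribute set. Exhaustive search is only practical for a handful of attributes; a real study would likely use a heuristic search and scaled attributes.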
Pages: 18