An optimized XGBoost based diagnostic system for effective prediction of heart disease

Cited by: 129
Authors
Budholiya, Kartik [1]
Shrivastava, Shailendra Kumar [1]
Sharma, Vivek [1]
Affiliations
[1] Samrat Ashok Technol Inst, Comp Sci & Engn, Vidisha, Madhya Pradesh, India
Keywords
XGBoost; Bayesian optimization; Categorical feature encoding; Heart disease; Prediction; Gaussian processes
DOI
10.1016/j.jksuci.2020.10.013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Researchers have created several expert systems over the years to predict heart disease early and assist cardiologists in improving the diagnostic process. We present a diagnostic system in this paper that utilizes an optimized XGBoost (Extreme Gradient Boosting) classifier to predict heart disease. Proper hyper-parameter tuning is essential for any classifier's successful application. To optimize the hyper-parameters of XGBoost, we used Bayesian optimization, which is a very efficient method for hyper-parameter optimization. We also used the One-Hot (OH) encoding technique to encode categorical features in the dataset to improve prediction accuracy. The efficacy of the proposed model was evaluated on the Cleveland heart disease dataset and compared with Random Forest (RF) and Extra Tree (ET) classifiers. Five evaluation metrics were used for performance evaluation: accuracy, sensitivity, specificity, F1-score, and AUC (area under the ROC curve). The experimental results showed the model's validity and efficacy in the prediction of heart disease. In addition, the proposed model performs better than previously suggested models, reaching a high prediction accuracy of 91.8%. Our results indicate that the proposed method could be used reliably to predict heart disease in the clinic. © 2020 The Authors. Published by Elsevier B.V. on behalf of King Saud University.
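As an illustration of two steps named in the abstract, the sketch below shows one-hot encoding of a single categorical column and the five reported evaluation metrics computed from binary labels and predicted scores. It is a minimal pure-Python sketch, not the authors' implementation: the function names `one_hot` and `binary_metrics`, the 0.5 decision threshold, and the choice to compute AUC by pairwise comparison of positive and negative scores are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def one_hot(values: Sequence[str]) -> List[Dict[str, int]]:
    """One-hot encode one categorical column (hypothetical helper).

    Each distinct category becomes its own 0/1 indicator field.
    """
    categories = sorted(set(values))
    return [{c: int(v == c) for c in categories} for v in values]


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, F1-score and AUC for a binary task."""
    y_pred = [int(s >= threshold) for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    # AUC as the probability that a randomly chosen positive case is scored
    # above a randomly chosen negative case (ties count as one half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "auc": auc,
    }
```

In practice, `binary_metrics` would be called with held-out labels and the classifier's predicted probabilities, and `one_hot` would be applied to each categorical feature before training. The pairwise AUC computation is quadratic in the number of samples, which is acceptable for a dataset of a few hundred records but not for large ones.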
Pages: 4514-4523
Page count: 10
Related papers
32 in total
  • [1] Abushariah M.A.M., 2014, J. Softw. Eng. Appl., V7, P1055
  • [2] Ali L., 2019, OPTIMIZED STACKED SU, DOI 10.1109/ACCESS.2019.2909969
  • [3] Diagnosis of Coronary Artery Disease Using Cost-Sensitive Algorithms
    Alizadehsani, Roohallah
    Hosseini, Mohammad Javad
    Sani, Zahra Alizadeh
    Ghandeharioun, Asma
    Boghrati, Reihane
    [J]. 12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), 2012 : 9 - 16
  • [4] [Anonymous], XGBOOST PAR XGBOOST
  • [5] [Anonymous], CLEVELAND HEART DIS
  • [6] Clinical decision support system: Risk level prediction of heart disease using weighted fuzzy rules
    Anooj, P. K.
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2012, 24 (01) : 27 - 40
  • [7] Computer aided decision making for heart disease detection using hybrid neural network-Genetic algorithm
    Arabasadi, Zeinab
    Alizadehsani, Roohallah
    Roshanzamir, Mohamad
    Moosaei, Hossein
    Yarifard, Ali Asghar
    [J]. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2017, 141 : 19 - 26
  • [8] A comparison of feature selection models utilizing binary particle swarm optimization and genetic algorithm in determining coronary artery disease using support vector machine
    Babaoglu, Ismail
    Findik, Oguz
    Ulker, Erkan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (04) : 3177 - 3183
  • [9] Bergstra J.S., 2011, P 25 ANN C NEUR INF
  • [10] Bergstra J, 2012, J MACH LEARN RES, V13, P281