An ensemble learning approach for diabetes prediction using boosting techniques

Cited by: 11
Authors
Ganie, Shahid Mohammad [1]
Pramanik, Pijush Kanti Dutta [2]
Malik, Majid Bashir [3]
Mallik, Saurav [4]
Qin, Hong [5]
Affiliations
[1] Woxsen Univ, AI Res Ctr, Sch Business, Hyderabad, India
[2] Galgotias Univ, Sch Comp Applicat & Technol, Greater Noida, India
[3] Baba Ghulam Shah Badshah Univ, Dept Comp Sci, Rajauri, India
[4] Harvard Univ, Sch Publ Hlth, Dept Environm Hlth, Boston, MA 02138 USA
[5] Univ Tennessee Chattanooga, Coll Engn & Comp Sci, Chattanooga, TN 37403 USA
Keywords
diabetes prediction; ensemble learning; XGBoost; CatBoost; LightGBM; AdaBoost; gradient boost
DOI
10.3389/fgene.2023.1252159
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Introduction: Diabetes is considered one of the leading healthcare concerns affecting millions worldwide. Taking appropriate action at the earliest stages of the disease depends on early diabetes prediction and identification. To support healthcare providers for better diagnosis and prognosis of diseases, machine learning has been explored in the healthcare industry in recent years.
Methods: To predict diabetes, this research has conducted experiments on five boosting algorithms on the Pima diabetes dataset. The dataset was obtained from the University of California, Irvine (UCI) machine learning repository, which contains several important clinical features. Exploratory data analysis was used to identify the characteristics of the dataset. Moreover, upsampling, normalisation, feature selection, and hyperparameter tuning were employed for predictive analytics.
Results: The results were analysed using various statistical/machine learning metrics and k-fold cross-validation techniques. Gradient boosting achieved the greatest accuracy rate of 92.85% among all the classifiers. Precision, recall, F1-score, and receiver operating characteristic (ROC) curves were used to further validate the model.
Discussion: The suggested model outperformed the current studies in terms of prediction accuracy, demonstrating its applicability to other diseases with similar predicate indications.
Pages: 15
Related papers
50 in total
  • [11] Performance prediction of roadheaders using ensemble machine learning techniques
    Seker, Sadi Evren
    Ocak, Ibrahim
    NEURAL COMPUTING & APPLICATIONS, 2019, 31 (04): : 1103 - 1116
  • [12] Prediction of Prostate Cancer using Ensemble of Machine Learning Techniques
    Oyewo, O. A.
    Boyinbode, O. K.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (03) : 149 - 154
  • [13] Crop Yield Prediction Using Ensemble Machine Learning Techniques
    P. Kuppan
    V. Vishwa Priya
    SN Computer Science, 5 (8)
  • [14] Enhancing Flood Prediction using Ensemble and Deep Learning Techniques
    Nti, Isaac Kofi
    Nyarko-Boateng, Owusu
    Boateng, Samuel
    Bawah, F. U.
    Agbedanu, P. R.
    Awarayi, N. S.
    Nimbe, P.
    Adekoya, A. F.
    Weyori, B. A.
    Akoto-Adjepong, Vivian
    2021 22ND INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2021, : 662 - 670
  • [16] Employee Attrition Prediction using Nested Ensemble Learning Techniques
    Alshiddy, Muneera Saad
    Aljaber, Bader Nasser
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (07) : 932 - 938
  • [17] Prediction of Heart Disease using an Ensemble Learning Approach
    Alshehri G.A.
    Alharbi H.M.
    International Journal of Advanced Computer Science and Applications, 2023, 14 (08) : 1089 - 1097
  • [18] Ensemble Learning Paradigms for Flow Rate Prediction Boosting
    Kouadio, Kouao Laurent
    Liu, Jianxin
    Kouamelan, Serge Kouamelan
    Liu, Rong
    WATER RESOURCES MANAGEMENT, 2023, 37 (11) : 4413 - 4431
  • [19] Stacked Ensemble-Based Type-2 Diabetes Prediction Using Machine Learning Techniques
    Rahim M.A.
    Hossain M.A.
    Hossain M.N.
    Shin J.
    Yun K.S.
    Annals of Emerging Technologies in Computing, 2023, 7 (01) : 30 - 39