An ensemble learning approach for diabetes prediction using boosting techniques

Cited by: 11
Authors
Ganie, Shahid Mohammad [1]
Pramanik, Pijush Kanti Dutta [2]
Malik, Majid Bashir [3]
Mallik, Saurav [4]
Qin, Hong [5]
Affiliations
[1] Woxsen Univ, AI Res Ctr, Sch Business, Hyderabad, India
[2] Galgotias Univ, Sch Comp Applicat & Technol, Greater Noida, India
[3] Baba Ghulam Shah Badshah Univ, Dept Comp Sci, Rajauri, India
[4] Harvard Univ, Sch Publ Hlth, Dept Environm Hlth, Boston, MA 02138 USA
[5] Univ Tennessee Chattanooga, Coll Engn & Comp Sci, Chattanooga, TN 37403 USA
Keywords
diabetes prediction; ensemble learning; XGBoost; CatBoost; LightGBM; AdaBoost; gradient boost
DOI
10.3389/fgene.2023.1252159
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Introduction: Diabetes is considered one of the leading healthcare concerns affecting millions worldwide. Taking appropriate action at the earliest stages of the disease depends on early diabetes prediction and identification. To support healthcare providers for better diagnosis and prognosis of diseases, machine learning has been explored in the healthcare industry in recent years.
Methods: To predict diabetes, this research has conducted experiments on five boosting algorithms on the Pima diabetes dataset. The dataset was obtained from the University of California, Irvine (UCI) machine learning repository, which contains several important clinical features. Exploratory data analysis was used to identify the characteristics of the dataset. Moreover, upsampling, normalisation, feature selection, and hyperparameter tuning were employed for predictive analytics.
Results: The results were analysed using various statistical/machine learning metrics and k-fold cross-validation techniques. Gradient boosting achieved the greatest accuracy rate of 92.85% among all the classifiers. Precision, recall, F1-score, and receiver operating characteristic (ROC) curves were used to further validate the model.
Discussion: The suggested model outperformed the current studies in terms of prediction accuracy, demonstrating its applicability to other diseases with similar predicate indications.
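The preprocessing and validation steps named in the abstract (upsampling, normalisation, k-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. The abstract does not say which upsampling or normalisation method the authors used, so random duplication of minority-class rows and min-max scaling are assumptions here, and the function names and toy data are hypothetical. No boosting model is trained in this sketch.

```python
import random


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest one.

    Assumption: plain random oversampling; the paper's actual method is not stated.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies with replacement to reach the target class size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def min_max_normalise(rows):
    """Rescale every feature column to [0, 1] (assumed normalisation scheme)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical toy data: two features, four negative and two positive samples.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0],
        [4.0, 40.0], [5.0, 50.0], [6.0, 60.0]]
labels = [0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = upsample_minority(rows, labels)
scaled_rows = min_max_normalise(balanced_rows)
splits = list(k_fold_indices(len(scaled_rows), k=4))
```

In a real evaluation, upsampling and scaling would usually be fitted on the training folds only, so that duplicated or rescaled rows do not leak into the test fold; whether the paper follows that order is not stated in the abstract.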
Pages: 15
Related papers
50 in total
  • [11] Credit Card Fraud Prediction Using XGBoost: An Ensemble Learning Approach
    Mohbey, Krishna Kumar
    Khan, Mohammad Zubair
    Indian, Ajay
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2022, 12 (02)
  • [12] Prediction of Heart Disease using an Ensemble Learning Approach
    Alshehri G.A.
    Alharbi H.M.
    International Journal of Advanced Computer Science and Applications, 2023, 14 (08) : 1089 - 1097
  • [13] Crop Yield Prediction Using Ensemble Machine Learning Techniques
    P. Kuppan
    V. Vishwa Priya
    SN Computer Science, 5 (8)
  • [14] Early Prediction of Diabetes Using an Ensemble of Machine Learning Models
    Dutta, Aishwariya
    Hasan, Md Kamrul
    Ahmad, Mohiuddin
    Awal, Md Abdul
    Islam, Md Akhtarul
    Masud, Mehedi
    Meshref, Hossam
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2022, 19 (19)
  • [15] Enhancing Heart Disease Prediction through Ensemble Learning Techniques with Hyperparameter Optimization
    Asif, Daniyal
    Bibi, Mairaj
    Arif, Muhammad Shoaib
    Mukheimer, Aiman
    ALGORITHMS, 2023, 16 (06)
  • [16] Delamination localization in the composite thin plates using ensemble learning: Bagging and boosting techniques
    Das, O.
    Das, D. B.
    SCIENTIA IRANICA, 2024, 31 (04) : 310 - 329
  • [17] A Probabilistic Approach for Prediction of Drilling Rate Index using Ensemble Learning Technique
    Kamran, Muhammad
    JOURNAL OF MINING AND ENVIRONMENT, 2021, 12 (02) : 327 - 337
  • [18] Prediction of the compressive strength of normal concrete using ensemble machine learning approach
    Sapkota S.C.
    Saha P.
    Das S.
    Meesaraganda L.V.P.
    Asian Journal of Civil Engineering, 2024, 25 (1) : 583 - 596
  • [19] Diabetes Prediction Using Ensemble Perceptron Algorithm
    Mirshahvalad, Roxana
    Zanjani, Nastaran Asadi
    2017 9TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2017 : 190 - 194
  • [20] Investigating Gender and Age Variability in Diabetes Prediction: A Multi-Model Ensemble Learning Approach
    Jain, Rishi
    Tripathi, Nitin Kumar
    Pant, Millie
    Anutariya, Chutiporn
    Silpasuwanchai, Chaklam
    IEEE ACCESS, 2024, 12 : 71535 - 71554