A Robust Machine Learning Framework for Diabetes Prediction

Cited by: 0
Authors
Olisah, Chollette [1]
Adeleye, Oluwaseun [2]
Smith, Lyndon [1]
Smith, Melvyn [1]
Affiliations
[1] Univ West England, Ctr Machine Vis, Bristol Robot Lab, Bristol, Avon, England
[2] Baze Univ, Dept Comp Sci, Abuja, Nigeria
Source
PROCEEDINGS OF THE FUTURE TECHNOLOGIES CONFERENCE (FTC) 2021, VOL 2 | 2022, Vol. 359
Keywords
Diabetes mellitus; Spearman correlation; Polynomial regression; Random forest; Classification; Machine learning; PIMA Indian; IMPUTATION; TREES
DOI
10.1007/978-3-030-89880-9_58
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Diabetes mellitus is a metabolic disorder characterized by hyperglycemia, which results from the body's inability to secrete and respond to insulin. If not properly managed or diagnosed in time, diabetes can damage vital organs such as the eyes, kidneys, nerves, heart, and blood vessels, and can be life-threatening. Years of research in the computational diagnosis of diabetes have shown machine learning to be a viable approach to diabetes prediction. However, the accuracy rates reported to date suggest that there is still much room for improvement. In this paper, we propose a machine learning framework to improve the performance of diabetes prediction on the PIMA Indian dataset. Through analysis, we observe that the main challenges of the dataset that impair learning are feature selection and missing values. For each of these challenges, we propose a working solution that applies Spearman correlation and polynomial regression from a new perspective. Further, we optimize the random forest classifier by tuning its hyperparameters using grid search and repeated stratified k-fold cross-validation to build a robust random forest model that scales to the prediction problem. Finally, through exhaustive experiments, we demonstrate that our proposed data preparation approaches lead to a robust machine learning framework for the diagnosis of diabetes mellitus, with train accuracy ranging from 98.96% to 100% and test accuracy ranging from 97.92% to 100%, outperforming all state-of-the-art results. The source code for the proposed machine learning framework is made publicly available.
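As a rough illustration of the feature-selection idea mentioned in the abstract, the sketch below computes a plain Spearman rank correlation between each feature and the class label and keeps the features whose absolute correlation meets a cutoff. It is not the authors' published code, and it does not reproduce whatever "new perspective" the paper applies. The function names, the toy values, and the 0.5 threshold are all assumptions made for illustration.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def select_features(columns, target, threshold):
    """Keep features whose |Spearman rho| with the target is >= threshold.

    `columns` maps a feature name to its list of values (hypothetical layout).
    """
    scores = {name: spearman(col, target) for name, col in columns.items()}
    selected = [name for name, rho in scores.items() if abs(rho) >= threshold]
    return selected, scores


# Toy data, invented for this example only (not taken from the PIMA dataset).
toy_columns = {
    "glucose": [85, 90, 120, 150, 170],
    "noise": [3, 1, 2, 1, 3],
}
toy_target = [0, 0, 1, 1, 1]
selected, scores = select_features(toy_columns, toy_target, threshold=0.5)
```

On the toy data, `selected` contains only `"glucose"`, since the invented `"noise"` column has zero rank correlation with the label. In practice, a library routine such as SciPy's `spearmanr` would normally replace the hand-written functions.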
Pages: 775-792
Number of pages: 18
Related papers
50 in total
  • [11] Robust analysis of photovoltaic plants: A framework based on prediction uncertainties by machine learning
    Dehshiri, Seyyed Shahabaddin Hosseini
    Firoozabadi, Bahar
    ENERGY CONVERSION AND MANAGEMENT-X, 2025, 26
  • [12] Diabetes Prediction using Machine Learning Algorithms
    Mujumdar, Aishwarya
    Vaidehi, V.
    2ND INTERNATIONAL CONFERENCE ON RECENT TRENDS IN ADVANCED COMPUTING ICRTAC - DISRUPTIVE INNOVATION, 2019, 2019, 165: 292 - 299
  • [13] A robust voting approach for diabetes prediction using traditional machine learning techniques
    Atik Mahabub
    SN Applied Sciences, 2019, 1
  • [14] Machine Learning for Diabetes Prediction
    Ahmed, Usman
    Li, Chunxiao
    12TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE (ICTC 2021): BEYOND THE PANDEMIC ERA WITH ICT CONVERGENCE INNOVATION, 2021: 16 - 19
  • [15] Comparative Study of Machine Learning Approaches in Diabetes Prediction
    Parameswari, P.
    Rajathi, N.
    BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2020, 13 (11): 42 - 46
  • [16] A Machine Learning Framework for Volume Prediction
    Onal, Umutcan
    Zafeirakopoulos, Zafeirakis
    ANALYSIS OF EXPERIMENTAL ALGORITHMS, SEA2 2019, 2019, 11544 : 408 - 423
  • [17] Performance Comparison of Machine Learning Models for Diabetes Prediction
    Cihan, Pinar
    Coskun, Hakan
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021
  • [18] Robust predictive framework for diabetes classification using optimized machine learning on imbalanced datasets
    Abousaber, Inam
    Abdallah, Haitham F.
    El-Ghaish, Hany
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2025, 7
  • [19] Predicting Diabetes Mellitus With Machine Learning Techniques
    Zou, Quan
    Qu, Kaiyang
    Luo, Yamei
    Yin, Dehui
    Ju, Ying
    Tang, Hua
    FRONTIERS IN GENETICS, 2018, 9
  • [20] Machine learning-based reproducible prediction of type 2 diabetes subtypes
    Tanabe, Hayato
    Sato, Masahiro
    Miyake, Akimitsu
    Shimajiri, Yoshinori
    Ojima, Takafumi
    Narita, Akira
    Saito, Haruka
    Tanaka, Kenichi
    Masuzaki, Hiroaki
    Kazama, Junichiro J.
    Katagiri, Hideki
    Tamiya, Gen
    Kawakami, Eiryo
    Shimabukuro, Michio
    DIABETOLOGIA, 2024, 67 (11) : 2446 - 2458