A Robust Machine Learning Framework for Diabetes Prediction

Cited by: 0
Authors
Olisah, Chollette [1]
Adeleye, Oluwaseun [2]
Smith, Lyndon [1]
Smith, Melvyn [1]
Affiliations
[1] Univ West England, Ctr Machine Vis, Bristol Robot Lab, Bristol, Avon, England
[2] Baze Univ, Dept Comp Sci, Abuja, Nigeria
Source
PROCEEDINGS OF THE FUTURE TECHNOLOGIES CONFERENCE (FTC) 2021, VOL 2 | 2022 / Vol. 359
Keywords
Diabetes mellitus; Spearman correlation; Polynomial regression; Random forest; Classification; Machine learning; PIMA Indian; IMPUTATION; TREES;
DOI
10.1007/978-3-030-89880-9_58
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Diabetes mellitus is a metabolic disorder characterized by hyperglycemia, which results from the body's inability to secrete and respond to insulin. If not properly managed or diagnosed in time, diabetes can pose a risk to vital body organs such as the eyes, kidneys, nerves, heart, and blood vessels, and can be life-threatening. Over many years of research on the computational diagnosis of diabetes, machine learning has proven to be a viable solution for diabetes prediction. However, the accuracy rates reported to date suggest that there is still much room for improvement. In this paper, we propose a machine learning framework to improve the performance of diabetes prediction on the PIMA Indian dataset. Through analysis, we observe that the main challenges of the dataset, which impair learning, are feature selection and missing values. For each of these challenges, we propose a working solution that applies Spearman correlation and polynomial regression from a new perspective. Further, we optimize the random forest classifier by tuning its hyperparameters using grid search and repeated stratified k-fold cross-validation to build a robust random forest model that scales to the prediction problem. Finally, through exhaustive experiments, we demonstrate that our proposed data preparation approaches lead to a robust machine learning framework for the diagnosis of diabetes mellitus, with train-accuracy and test-accuracy values that range from 98.96% to 100% and from 97.92% to 100%, respectively, which outperforms all the state-of-the-art results. The source code for the proposed machine learning framework is made publicly available.
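The abstract names Spearman correlation as one ingredient of the feature-selection step but does not say how the authors apply it. As a minimal sketch only, and not the authors' method, the following shows one common way to use it: compute the Spearman rank correlation between each feature and the outcome, then order the features by absolute correlation. The function names, the toy column names, and the toy values are all hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Constant column: the correlation is undefined; this sketch
        # treats it as uninformative (an assumption, not a standard).
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_features(columns, target):
    """Sort (feature name, rho) pairs by absolute Spearman correlation, strongest first."""
    scores = {name: spearman(col, target) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data, not taken from the PIMA Indian dataset.
toy_columns = {
    "glucose": [85, 168, 122, 190, 99, 150],
    "skin_thickness": [30, 20, 25, 22, 28, 31],
}
toy_outcome = [0, 1, 0, 1, 0, 1]

for name, rho in rank_features(toy_columns, toy_outcome):
    print(f"{name}: rho = {rho:.3f}")
```

Because Spearman correlation works on ranks, it captures monotonic but non-linear relationships, which is one plausible reason to prefer it over Pearson correlation for clinical measurements. Any cut-off for keeping or dropping a feature would be a further assumption that the abstract does not specify.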
Pages: 775 - 792
Number of pages: 18
Related papers
50 in total
  • [21] Developing machine learning based framework for the network traffic prediction
    Murugesan, G.
    Jaiswal, Rachana
    Kshatri, Sapna Singh
    Bhonsle, Devanand
    INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2022, 13 (03) : 777 - 784
  • [22] Prediction and diagnosis of future diabetes risk: a machine learning approach
    Birjais, Roshan
    Mourya, Ashish Kumar
    Chauhan, Ritu
    Kaur, Harleen
    SN APPLIED SCIENCES, 2019, 1 (09):
  • [23] A review on current advances in machine learning based diabetes prediction
    Jaiswal, Varun
    Negi, Anjli
    Pal, Tarun
    PRIMARY CARE DIABETES, 2021, 15 (03) : 435 - 443
  • [24] An Overview of Diabetes Mellitus Prediction Through Machine Learning Approaches
    Atif, Mohammad
    Siddiqui, Jamshed
    Talib, Faisal
    PROCEEDINGS OF THE 2019 6TH INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT (INDIACOM), 2019, : 1145 - 1150
  • [25] Diabetes prediction using machine learning and explainable AI techniques
    Tasin, Isfafuzzaman
    Nabil, Tansin Ullah
    Islam, Sanjida
    Khan, Riasat
    HEALTHCARE TECHNOLOGY LETTERS, 2023, 10 (1-2) : 1 - 10
  • [26] Diabetes Prediction Using Ensembling of Different Machine Learning Classifiers
    Hasan, Md. Kamrul
    Alam, Md. Ashraful
    Das, Dola
    Hossain, Eklas
    Hasan, Mahmudul
    IEEE ACCESS, 2020, 8 : 76516 - 76531
  • [27] A survey on diabetes risk prediction using machine learning approaches
    Firdous, Shimoo
    Wagai, Gowher A.
    Sharma, Kalpana
    JOURNAL OF FAMILY MEDICINE AND PRIMARY CARE, 2022, 11 (11) : 6929 - 6934
  • [28] Machine learning and balanced techniques for diabetes prediction
    Narvaez, Liliana
    Reategui, Ruth
    2023 FOURTH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS AND SOFTWARE TECHNOLOGIES, ICI2ST 2023, 2023, : 68 - 73
  • [29] Diabetes Prediction using Machine Learning Techniques
    Obulesu, O.
    Suresh, K.
    Ramudu, B. Venkata
    HELIX, 2020, 10 (02) : 136 - 142
  • [30] A comparison of machine learning algorithms for diabetes prediction
    Khanam, Jobeda Jamal
    Foo, Simon Y.
    ICT EXPRESS, 2021, 7 (04) : 432 - 439