A Proposed Technique Using Machine Learning for the Prediction of Diabetes Disease through a Mobile App

Cited by: 8
Authors
El-Sofany, Hosam [1, 2]
El-Seoud, Samir A. [3]
Karam, Omar H. [3]
Abd El-Latif, Yasser M. [4]
Taj-Eddin, Islam A. T. F. [5]
Affiliations
[1] King Khalid Univ, Coll Comp Sci, Abha, Saudi Arabia
[2] Cairo Higher Inst Engn Comp Sci & Management, Cairo, Egypt
[3] British Univ Egypt BUE, Fac Informat & Comp Sci, Cairo, Egypt
[4] Ain Shams Univ, Fac Sci, Cairo, Egypt
[5] Assiut Univ, Fac Comp & Informat, Assiut, Egypt
Keywords
Adaptive boosting; Decision trees; Logistic regression; Support vector machines
DOI
10.1155/2024/6688934
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the increasing prevalence of diabetes in Saudi Arabia, there is a critical need for early detection and prediction of the disease to prevent long-term health complications. This study addresses this need by using machine learning (ML) techniques applied to the Pima Indians dataset and private diabetes datasets through the implementation of a computerized system for predicting diabetes. In contrast to prior research, this study employs a semisupervised model combined with strong gradient boosting, effectively predicting diabetes-related features of the dataset. Additionally, the researchers employ the SMOTE technique to deal with the problem of imbalanced classes. Ten ML classification techniques, including logistic regression, random forest, KNN, decision tree, bagging, AdaBoost, XGBoost, voting, SVM, and Naive Bayes, are evaluated to determine the algorithm that produces the most accurate diabetes prediction. The proposed approach has achieved impressive performance. For the private dataset, the XGBoost algorithm with SMOTE achieved an accuracy of 97.4%, an F1 coefficient of 0.95, and an AUC of 0.87. For the combined datasets, it achieved an accuracy of 83.1%, an F1 coefficient of 0.76, and an AUC of 0.85. To understand how the model predicts the final results, an explainable AI technique using SHAP methods is implemented. Furthermore, the study demonstrates the adaptability of the proposed system by applying a domain adaptation method. To further enhance accessibility, a mobile app has been developed for instant diabetes prediction based on user-entered features. This study contributes novel insights and techniques to the field of ML-based diabetic prediction, potentially aiding in the early detection and management of diabetes in Saudi Arabia.
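The abstract names SMOTE as the remedy for class imbalance but does not describe how it works. The following is a minimal sketch of SMOTE's core idea (synthesizing minority-class samples by interpolating between a sample and one of its nearest minority-class neighbours), written with the Python standard library only. It is not the authors' implementation: the function name `smote_oversample`, the parameter `k`, and the toy (glucose, BMI) values are assumptions made for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Illustrative sketch only; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (glucose, BMI) rows for the minority (diabetic) class.
minority = [(148.0, 33.6), (183.0, 23.3), (168.0, 43.1), (197.0, 30.5)]
majority_count = 10  # assumed size of the majority class
new_points = smote_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

Because each synthetic point is a convex combination of two real minority samples, it always lies on the segment between them. In practice, oversampling of this kind is typically applied to the training split only, so that the test data stays free of synthetic samples.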
Pages: 13