A Proposed Technique Using Machine Learning for the Prediction of Diabetes Disease through a Mobile App

Cited by: 8
Authors
El-Sofany, Hosam [1, 2]
El-Seoud, Samir A. [3]
Karam, Omar H. [3]
Abd El-Latif, Yasser M. [4]
Taj-Eddin, Islam A. T. F. [5]
Affiliations
[1] King Khalid Univ, Coll Comp Sci, Abha, Saudi Arabia
[2] Cairo Higher Inst Engn Comp Sci & Management, Cairo, Egypt
[3] British Univ Egypt BUE, Fac Informat & Comp Sci, Cairo, Egypt
[4] Ain Shams Univ, Fac Sci, Cairo, Egypt
[5] Assiut Univ, Fac Comp & Informat, Assiut, Egypt
Keywords
Adaptive boosting; Decision trees; Logistic regression; Support vector machines
DOI
10.1155/2024/6688934
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the increasing prevalence of diabetes in Saudi Arabia, there is a critical need for early detection and prediction of the disease to prevent long-term health complications. This study addresses this need by using machine learning (ML) techniques applied to the Pima Indians dataset and private diabetes datasets through the implementation of a computerized system for predicting diabetes. In contrast to prior research, this study employs a semisupervised model combined with strong gradient boosting, effectively predicting diabetes-related features of the dataset. Additionally, the researchers employ the SMOTE technique to deal with the problem of imbalanced classes. Ten ML classification techniques, including logistic regression, random forest, KNN, decision tree, bagging, AdaBoost, XGBoost, voting, SVM, and Naive Bayes, are evaluated to determine the algorithm that produces the most accurate diabetes prediction. The proposed approach has achieved impressive performance. For the private dataset, the XGBoost algorithm with SMOTE achieved an accuracy of 97.4%, an F1 coefficient of 0.95, and an AUC of 0.87. For the combined datasets, it achieved an accuracy of 83.1%, an F1 coefficient of 0.76, and an AUC of 0.85. To understand how the model predicts the final results, an explainable AI technique using SHAP methods is implemented. Furthermore, the study demonstrates the adaptability of the proposed system by applying a domain adaptation method. To further enhance accessibility, a mobile app has been developed for instant diabetes prediction based on user-entered features. This study contributes novel insights and techniques to the field of ML-based diabetic prediction, potentially aiding in the early detection and management of diabetes in Saudi Arabia.
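The abstract names SMOTE as the remedy for class imbalance but does not describe it. The sketch below illustrates the general idea only: each synthetic minority sample is placed at a random point on the line segment between a minority record and one of its nearest minority neighbours. It is not the authors' implementation; the function name `smote`, its parameters, and the toy two-feature data are assumptions made for illustration. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length feature lists (minority class only).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: two scaled features per record (not from the paper).
majority = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3],
            [0.3, 0.25], [0.25, 0.15], [0.05, 0.1]]
minority = [[0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Oversampling of this kind is normally applied to the training split only, so that synthetic records do not leak into the data used for evaluation.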
Pages: 13