A machine learning-based diabetes risk prediction modeling study

Citations: 0
Authors
Ming, Jiexiu [1]
Xu, Junyi [1]
Zhang, Miaomiao [1]
Li, Ningyu [1]
Yan, Xu [2]
Affiliations
[1] Wuhan Donghu Univ, Wuhan 430212, Hubei, Peoples R China
[2] Wuhan Inst Technol, Univ Hosp, Wuhan 430205, Hubei, Peoples R China
Source
PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024 | 2024
Keywords
Factor analysis; Machine learning; Bayesian optimization; Support vector machine regression; SVR
DOI
10.1145/3675249.3675313
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Diabetes mellitus is a chronic metabolic disease characterized mainly by insufficient insulin secretion or impaired insulin action, resulting in elevated blood glucose. According to the World Health Organization (WHO), the number of people with diabetes worldwide has been rising in recent years, and the disease has become an important global public health problem. In this paper, we used a Random Forest-based feature importance screening method to retain the variables with larger feature weights, performed Spearman correlation analysis, and selected the top 10 operational variables with lower correlations. We then used information entropy theory and correlation analysis to test the representativeness and independence of the main variables. The main variables finally screened out were platelet volume distribution width, HDL cholesterol, the proportion of white globules, platelet specific volume, platelet count, red blood cell count, lymphocyte %, albumin, neutrophil %, and leukocyte count. Blood glucose prediction models were established using data mining techniques. Five machine learning models were selected for prediction: Extreme Gradient Boosted Tree (XGBoost), Random Forest Regression, Support Vector Machine Regression (SVR), LightGBM, and Gradient Boosted Decision Tree (GBDT). Each model was trained on the training set and then given the test set as input to obtain its root mean squared error (MSE), Mean Absolute Error (MAE), and Maximum Absolute Error (MAS). Comparing the five models, SVR had the highest accuracy overall.
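The correlation-based screening step described above can be illustrated with a minimal sketch. It assumes a greedy rule that is not stated in the abstract: walk the variables in descending order of importance and keep one only if its absolute Spearman correlation with every variable already kept is below a threshold. The function names, the threshold `max_abs_rho`, the importance scores, and the toy data are all hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_low_correlation(columns, importance, max_abs_rho=0.8, top_k=10):
    """Greedy screening (an assumed rule, not the paper's exact procedure)."""
    kept = []
    for name in sorted(importance, key=importance.get, reverse=True):
        if all(abs(spearman(columns[name], columns[other])) < max_abs_rho
               for other in kept):
            kept.append(name)
        if len(kept) == top_k:
            break
    return kept


# Made-up toy data: the second column rises monotonically with the first.
columns = {
    "platelet_count": [1, 2, 3, 4, 5],
    "platelet_specific_volume": [2, 4, 6, 8, 11],
    "albumin": [5, 1, 4, 2, 3],
}
# Hypothetical importance scores standing in for Random Forest output.
importance = {
    "platelet_count": 0.5,
    "platelet_specific_volume": 0.3,
    "albumin": 0.2,
}
selected = select_low_correlation(columns, importance)
print(selected)
```

In this toy example the second column is dropped because it is perfectly rank-correlated with the first, while `albumin` is kept.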
To establish an SVR blood glucose prediction model based on Bayesian optimization, the sample data were normalized, the parameters were initially corrected using Bayesian principles, and the support vector machine estimation algorithm was selected to initialize the model. The parameters were then inferred using the Bayesian evidence framework, and the optimal model was established after several iterations. The SVR model trained with the optimal hyperparameters obtained from Bayesian optimization improved in accuracy on all three evaluation metrics.
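The three evaluation metrics can be computed as in the following minimal sketch. Because the abstract pairs the name "root mean squared error" with the abbreviation MSE, the sketch reports both the mean squared error and its square root. The function name `regression_errors` and all numbers are hypothetical and are not results from the paper.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MSE, RMSE, MAE, and maximum absolute error for paired values."""
    abs_residuals = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(abs_residuals)
    mse = sum(r * r for r in abs_residuals) / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs_residuals) / n,
        "max_abs": max(abs_residuals),
    }


# Made-up blood glucose values and predictions, for illustration only.
y_true = [5.0, 6.0, 7.0, 8.0]
baseline_pred = [5.5, 5.0, 7.5, 9.0]   # stands in for an untuned model
tuned_pred = [5.0, 5.5, 7.0, 8.5]      # stands in for a tuned model

baseline = regression_errors(y_true, baseline_pred)
tuned = regression_errors(y_true, tuned_pred)

# A model "improves on all three metrics" when every error is lower.
improved = all(tuned[k] < baseline[k] for k in ("mse", "mae", "max_abs"))
print(baseline, tuned, improved)
```

Comparing the two dictionaries key by key is one simple way to check whether a tuned model is better on every metric at once.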
Pages: 363-369
Page count: 7