Machine Learning for Predicting the 3-Year Risk of Incident Diabetes in Chinese Adults

Cited by: 19
Authors
Wu, Yang [1, 2, 3]
Hu, Haofei [3, 4, 5]
Cai, Jinlin [1, 2, 6]
Chen, Runtian [1, 2, 3]
Zuo, Xin [7]
Cheng, Heng [7]
Yan, Dewen [1, 2, 3]
Affiliations
[1] Shenzhen Univ, Affiliated Hosp 1, Dept Endocrinol, Shenzhen, Peoples R China
[2] Shenzhen Second Peoples Hosp, Dept Endocrinol, Shenzhen, Peoples R China
[3] Shenzhen Univ, Hlth Sci Ctr, Shenzhen, Peoples R China
[4] Shenzhen Univ, Affiliated Hosp 1, Dept Nephrol, Shenzhen, Peoples R China
[5] Shenzhen Second Peoples Hosp, Dept Nephrol, Shenzhen, Peoples R China
[6] Shantou Univ, Med Coll, Shantou, Peoples R China
[7] Third Peoples Hosp Shenzhen, Dept Endocrinol, Shenzhen, Peoples R China
Keywords
machine learning; extreme gradient boosting; simple stepwise model; incident diabetes; risk; TYPE-2; MELLITUS; MODELS; COMPLICATIONS; NOMOGRAM; TRENDS; IMPACT; BMI
DOI
10.3389/fpubh.2021.626331
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Purpose: We aimed to establish and validate a risk assessment system that combines demographic and clinical variables to predict the 3-year risk of incident diabetes in Chinese adults.
Methods: A 3-year cohort study was performed on 15,928 Chinese adults without diabetes at baseline. All participants were randomly divided into a training set (n = 7,940) and a validation set (n = 7,988). Extreme gradient boosting (XGBoost), a machine learning technique, was used to select the most important variables from the candidate variables, and a stepwise model was then built from the predictors chosen by XGBoost. The area under the receiver operating characteristic curve (AUC), decision curve analysis, and calibration analysis were used to assess the discrimination, clinical usefulness, and calibration of the model, respectively. External validation was performed on a cohort of 11,113 Japanese participants.
Results: In the training and validation sets, 148 and 145 incident diabetes cases occurred, respectively. XGBoost selected the 10 most important variables from 15 candidate variables; fasting plasma glucose (FPG), body mass index (BMI), and age were the top 3. A stepwise model and a prediction nomogram were then established. The AUCs of the stepwise model were 0.933 and 0.910 in the training and validation sets, respectively. The Hosmer-Lemeshow test showed no statistically significant difference between the predicted and observed diabetes risk (p = 0.068 for the training set, p = 0.165 for the validation set). Decision curve analysis supported the clinical usefulness of the stepwise model across a wide range of threshold probabilities. There were almost no interactions among the predictors (most P-values for interaction > 0.05). Furthermore, the AUC for the external validation set was 0.830, and the Hosmer-Lemeshow test for the external validation set showed no statistically significant difference between the predicted and observed diabetes risk (P = 0.824).
Conclusion: We established and validated a risk assessment system for characterizing the 3-year risk of incident diabetes.
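The two evaluation metrics named in the abstract, AUC for discrimination and the Hosmer-Lemeshow test for calibration, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names are hypothetical, and the equal-size risk groups with `groups - 2` degrees of freedom are assumptions, since the abstract does not state how the test was configured. The p-value uses the closed-form chi-square tail, which is exact only for an even number of degrees of freedom.

```python
import math


def auc(y_true, y_score):
    """AUC as the probability that a random case outranks a random non-case.

    Ties between a case and a non-case count as 0.5 (Mann-Whitney identity).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (chi2, p_value) comparing observed and predicted event counts.

    Assumes predicted probabilities lie strictly between 0 and 1 and that
    there are at least `groups` observations.
    """
    pairs = sorted(zip(y_prob, y_true))  # order participants by predicted risk
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        mean_risk = expected / size
        chi2 += (observed - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
    return chi2, chi2_sf_even_df(chi2, groups - 2)
```

A large Hosmer-Lemeshow p-value means the test found no significant gap between predicted and observed risk. It does not prove that calibration is perfect, which is why the p-values above are best read alongside the calibration curves.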
Pages: 12
Related papers
50 in total
  • [11] Changes in sleep duration and 3-year risk of mild cognitive impairment in Chinese older adults
    Zhu, Qi
    Fan, Hui
    Zhang, Xiaoning
    Ji, Chao
    Xia, Yang
    AGING-US, 2020, 12 (01) : 309 - 317
  • [12] Machine learning for the prediction of atherosclerotic cardiovascular disease during 3-year follow up in Chinese type 2 diabetes mellitus patients
    Ding, Jinru
    Luo, Yingying
    Shi, Huwei
    Chen, Ruiyao
    Luo, Shuqing
    Yang, Xu
    Xiao, Zhongzhou
    Liang, Bilin
    Yan, Qiujuan
    Xu, Jie
    Ji, Linong
    JOURNAL OF DIABETES INVESTIGATION, 2023, 14 (11) : 1289 - 1302
  • [13] Investigating machine learning models in predicting lake water quality parameters as a 3-year moving average
    Gorgan-Mohammadi, Faezeh
    Rajaee, Taher
    Zounemat-Kermani, Mohammad
    ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH, 2023, 30 (23) : 63839 - 63863
  • [14] Construction of a 3-year risk prediction model for developing diabetes in patients with pre-diabetes
    Yang, Jianshu
    Liu, Dan
    Du, Qiaoqiao
    Zhu, Jing
    Lu, Li
    Wu, Zhengyan
    Zhang, Daiyi
    Ji, Xiaodong
    Zheng, Xiang
    FRONTIERS IN ENDOCRINOLOGY, 2024, 15
  • [15] Investigating machine learning models in predicting lake water quality parameters as a 3-year moving average
    Faezeh Gorgan-Mohammadi
    Taher Rajaee
    Mohammad Zounemat-Kermani
    Environmental Science and Pollution Research, 2023, 30 : 63839 - 63863
  • [16] Effects of bariatric surgery in Chinese with obesity and type 2 diabetes mellitus: A 3-year follow-up
    Zuo, Didi
    Xiao, Xianchao
    Yang, Shuo
    Gao, Yuan
    Wang, Guixia
    Ning, Guang
    MEDICINE, 2020, 99 (34) : E21673
  • [17] Diabetes Care and Dementia Among Older Adults: A Nationwide 3-Year Longitudinal Study
    Wargny, Matthieu
    Gallini, Adeline
    Hanaire, Helene
    Nourhashemi, Fati
    Andrieu, Sandrine
    Gardette, Virginie
    JOURNAL OF THE AMERICAN MEDICAL DIRECTORS ASSOCIATION, 2018, 19 (07) : 601 - +
  • [18] Identification of patients at risk for pancreatic cancer in a 3-year timeframe based on machine learning algorithms
    Zhu, Weicheng
    Chen, Long
    Aphinyanaphongs, Yindalon
    Kastrinos, Fay
    Simeone, Diane M.
    Pochapin, Mark
    Stender, Cody
    Razavian, Narges
    Gonda, Tamas A.
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [19] Predicting the Risk of Diabetes and Heart Disease with Machine Learning Classifiers: The Mediation Analysis
    Verma, Ajay
    Jain, Manisha
    MEASUREMENT-INTERDISCIPLINARY RESEARCH AND PERSPECTIVES, 2024,
  • [20] Machine Learning for Predicting the Risk of Transition from Prediabetes to Diabetes
    Zueger, Thomas
    Schallmoser, Simon
    Kraus, Mathias
    Saar-Tsechansky, Maytal
    Feuerriegel, Stefan
    Stettler, Christoph
    DIABETES TECHNOLOGY & THERAPEUTICS, 2022, 24 (11) : 842 - 847