Ensemble machine learning prediction of hyperuricemia based on a prospective health checkup population

Cited by: 1
Authors
Zhang, Yongsheng [1, 2, 3, 4]
Zhang, Li [5]
Lv, Haoyue [1, 2, 3, 4]
Zhang, Guang [1, 2, 3, 4]
Affiliations
[1] Shandong First Med Univ, Affiliated Hosp 1, Hlth Management Ctr, Jinan, Peoples R China
[2] Shandong Prov Qianfoshan Hosp, Jinan, Peoples R China
[3] Shandong First Med Univ, Affiliated Hosp 1, Inst Hlth Management, Jinan, Peoples R China
[4] Shandong First Med Univ, Shandong Engn Lab Hlth Management, Affiliated Hosp 1, Jinan, Peoples R China
[5] Shandong First Med Univ, Jinan Cent Hosp, Dept Pharmacol, Jinan, Peoples R China
Keywords
hyperuricemia; prediction model; machine learning; stacking ensemble; risk factors; SEGMENTATION; DIAGNOSIS; ALGORITHM; MODELS
DOI
10.3389/fphys.2024.1357404
Chinese Library Classification number
Q4 [Physiology]
Discipline classification code
071003
Abstract
Objectives: An accurate prediction model for hyperuricemia (HUA) in adults remains unavailable. This study aimed to develop a stacking ensemble prediction model for HUA to identify high-risk groups and explore risk factors.
Methods: A prospective health checkup cohort of 40,899 subjects was randomly divided into training and validation sets at a ratio of 7:3. LASSO regression was used to select important features, and ROSE sampling was then used to handle class imbalance. A stacking ensemble model was constructed from three individual models: support vector machine, decision tree C5.0, and eXtreme gradient boosting. Models were validated using the area under the receiver operating characteristic curve (AUC) and the calibration curve, as well as accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. A model-agnostic, instance-level variable attribution technique (iBreakdown) was used to explain the black-box ensemble model and to identify contributing risk factors.
Results: Fifteen important features were selected from 23 clinical variables. The stacking ensemble model, with an AUC of 0.854, outperformed the three individual models (support vector machine, decision tree C5.0, and eXtreme gradient boosting), which had AUCs of 0.848, 0.851, and 0.849, respectively. Calibration accuracy and other metrics, including accuracy, specificity, negative predictive value, and F1 score, also supported the ensemble model's superiority. The contributing risk factors were estimated using six randomly selected subjects, which showed that being female and relatively younger, together with having higher baseline uric acid, body mass index, gamma-glutamyl transpeptidase, total protein, triglycerides, creatinine, and fasting blood glucose, can increase the risk of HUA. To further validate the model's applicability in the health checkup population, another cohort of 8,559 subjects was used; the ensemble prediction model again performed favorably, with an AUC of 0.846.
Conclusion: A stacking ensemble prediction model for HUA was developed, and it outperformed the three individual models that compose it (support vector machine, decision tree C5.0, and eXtreme gradient boosting). The contributing risk factors were identified, providing insight into HUA risk.
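The validation metrics named in the abstract (AUC, accuracy, sensitivity, specificity, positive and negative predictive value, F1 score) can be illustrated with a minimal sketch. This is not the authors' code: the names `BinaryMetrics`, `binary_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration, and the AUC is computed here by direct pairwise comparison rather than by whatever routine the study used.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    f1: float


def _ratio(num, den):
    # Return NaN instead of raising when a denominator is zero.
    return num / den if den else float("nan")


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for labels coded 1 (HUA) and 0 (no HUA)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=_ratio(tp, tp + fn),   # true positive rate
        specificity=_ratio(tn, tn + fp),   # true negative rate
        ppv=_ratio(tp, tp + fp),           # positive predictive value
        npv=_ratio(tn, tn + fn),           # negative predictive value
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(binary_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

The AUC depends only on how the predicted scores rank the subjects, while the other six metrics depend on the threshold used to turn scores into class labels. That is one reason an abstract like this one reports both kinds of measure.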
Pages: 14