Ensemble machine learning prediction of hyperuricemia based on a prospective health checkup population

Cited by: 1
Authors
Zhang, Yongsheng [1, 2, 3, 4]
Zhang, Li [5]
Lv, Haoyue [1, 2, 3, 4]
Zhang, Guang [1, 2, 3, 4]
Affiliations
[1] Shandong First Med Univ, Affiliated Hosp 1, Hlth Management Ctr, Jinan, Peoples R China
[2] Shandong Prov Qianfoshan Hosp, Jinan, Peoples R China
[3] Shandong First Med Univ, Affiliated Hosp 1, Inst Hlth Management, Jinan, Peoples R China
[4] Shandong First Med Univ, Shandong Engn Lab Hlth Management, Affiliated Hosp 1, Jinan, Peoples R China
[5] Shandong First Med Univ, Jinan Cent Hosp, Dept Pharmacol, Jinan, Peoples R China
Keywords
hyperuricemia; prediction model; machine learning; stacking ensemble; risk factors; SEGMENTATION; DIAGNOSIS; ALGORITHM; MODELS
DOI
10.3389/fphys.2024.1357404
Chinese Library Classification (CLC) number
Q4 [Physiology]
Discipline classification code
071003
Abstract
Objectives: An accurate prediction model for hyperuricemia (HUA) in adults remains unavailable. This study aimed to develop a stacking ensemble prediction model for HUA to identify high-risk groups and explore risk factors.

Methods: A prospective health checkup cohort of 40,899 subjects was examined and randomly divided into training and validation sets at a ratio of 7:3. LASSO regression was used to select important features, and ROSE sampling was then used to handle class imbalance. A stacking ensemble model was constructed from three individual models: support vector machine, decision tree C5.0, and eXtreme gradient boosting. The models were validated using the area under the receiver operating characteristic curve (AUC) and the calibration curve, as well as accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. A model-agnostic, instance-level variable attribution technique (iBreakdown) was used to explain the predictions of the black-box ensemble model and to identify contributing risk factors.

Results: Fifteen important features were selected from 23 clinical variables. The stacking ensemble model, with an AUC of 0.854, outperformed its three component models (support vector machine, decision tree C5.0, and eXtreme gradient boosting), which had AUCs of 0.848, 0.851, and 0.849, respectively. Calibration accuracy and other metrics, including accuracy, specificity, negative predictive value, and F1 score, also demonstrated the ensemble model's superiority. The contributing risk factors were estimated using six randomly selected subjects; the results showed that being female and relatively younger, together with having higher baseline uric acid, body mass index, gamma-glutamyl transpeptidase, total protein, triglycerides, creatinine, and fasting blood glucose, can increase the risk of HUA. To further validate the model's applicability in the health checkup population, another cohort of 8,559 subjects was used; the ensemble model again performed favorably, with an AUC of 0.846.

Conclusion: A stacking ensemble prediction model for HUA was developed, and it outperformed the three individual models that compose it (support vector machine, decision tree C5.0, and eXtreme gradient boosting). The contributing risk factors were identified, providing insight into HUA risk.
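
The validation metrics named in the abstract (AUC, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' code: the function names, the coding of labels as 1 = HUA and 0 = no HUA, and the idea of thresholding a predicted score are all assumptions made here for illustration. The AUC is computed as the fraction of positive-negative pairs in which the positive subject receives the higher score, which is equivalent to the area under the ROC curve.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (assumed coding: 1 = HUA, 0 = no HUA)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Threshold-based metrics listed in the abstract, derived from the confusion counts."""
    c = confusion_counts(y_true, y_pred)
    tp, fp, tn, fn = c["tp"], c["fp"], c["tn"], c["fn"]

    def ratio(num: float, den: float) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value (precision)
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5.

    This pairwise form is O(n_pos * n_neg), fine for a sketch but slow for large cohorts.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up toy data for illustration only; not taken from the study.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    preds = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption
    print(classification_metrics(labels, preds))
    print("AUC:", auc(labels, scores))
```

Note that AUC depends only on the ranking of the scores, whereas the other six metrics depend on the chosen cutoff, which is why a model can lead on AUC without leading on every threshold-based metric.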
Pages: 14