Establishment of a machine learning predictive model for non-alcoholic fatty liver disease: A longitudinal cohort study

Cited by: 2
Authors
Cao, Tengrui [1, 2]
Zhu, Qian [1, 2, 3]
Tong, Chao [4]
Halengbieke, Aheyeerke [1, 2]
Ni, Xuetong [1, 2]
Tang, Jianmin [1, 2]
Han, Yumei [5]
Li, Qiang [5]
Yang, Xinghua [1, 2]
Affiliations
[1] Capital Med Univ, Sch Publ Hlth, 10 Xitoutiao, Beijing 100069, Peoples R China
[2] Beijing Municipal Key Lab Clin Epidemiol, 10 Xitoutiao, Beijing 100069, Peoples R China
[3] Chinese Acad Med Sci & Peking Union Med Coll, Natl Canc Ctr, Natl Clin Res Ctr Canc, Canc Hosp,Off Canc Registry, Beijing 100021, Peoples R China
[4] Beijing Ctr Dis Prevent & Control, Beijing 100013, Peoples R China
[5] Beijing Phys Examinat Ctr, Sci & Educ Sect, 59 Beiwei Rd, Beijing 100050, Peoples R China
Funding
Beijing Natural Science Foundation; National Key Research and Development Program of China
Keywords
Non-alcoholic fatty liver disease; Predictive model; eXtreme gradient boosting; Machine learning; DIAGNOSIS; INDEX; NAFLD; TESTS;
DOI
10.1016/j.numecd.2024.02.004
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background and aims: Non-alcoholic fatty liver disease (NAFLD) is a common chronic liver disease that lacks effective drug treatments. This study aimed to construct an eXtreme Gradient Boosting (XGBoost) prediction model to identify or evaluate potential NAFLD patients. Methods and results: We conducted a longitudinal study of 22,140 individuals from the Beijing Health Management Cohort. Variable filtering was performed using the least absolute shrinkage and selection operator. Random Over Sampling Examples was used to address imbalanced data. Next, the XGBoost model and three other machine learning (ML) models were built using the balanced data. Finally, the variable importance of the XGBoost model was ranked. Among the four ML algorithms, the XGBoost model outperformed the others, with an accuracy of 0.835, sensitivity of 0.835, specificity of 0.834, Youden index of 0.669, precision of 0.831, recall of 0.835, F-1 score of 0.833, and an area under the curve of 0.914. The five variables with the greatest impact on the onset of NAFLD were aspartate aminotransferase, cardiometabolic index, body mass index, alanine aminotransferase, and triglyceride-glucose index. Conclusion: The predictive model based on the XGBoost algorithm enables early prediction of the onset of NAFLD. Additionally, assessing variable importance provides valuable insights into the prevention and treatment of NAFLD. © 2024 The Italian Diabetes Society, the Italian Society for the Study of Atherosclerosis, the Italian Society of Human Nutrition and the Department of Clinical Medicine and Surgery, Federico II University. Published by Elsevier B.V. All rights reserved.
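Several of the metrics reported in the abstract are derived from one another, so a short sketch may help readers see how they relate. The Python below is not the authors' code; the function names are illustrative assumptions, and it uses only the standard definitions (Youden index = sensitivity + specificity - 1, F1 = harmonic mean of precision and recall) applied to the values quoted in the abstract.

```python
# Illustrative sketch only: function names are hypothetical, not from the paper.

def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive the abstract's metric set from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_index": youden_index(sensitivity, specificity),
        "precision": precision,
        "recall": sensitivity,
        "f1_score": f1_score(precision, sensitivity),
    }


# Values quoted in the abstract for the XGBoost model.
reported_youden = round(youden_index(0.835, 0.834), 3)
reported_f1 = round(f1_score(0.831, 0.835), 3)
print(reported_youden, reported_f1)
```

Under these definitions the reported sensitivity and specificity reproduce the stated Youden index, and the reported precision and recall reproduce the stated F-1 score, which suggests the figures in the abstract are internally consistent. Recall and sensitivity are the same quantity, which is why both are listed as 0.835.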
Pages: 1456-1466
Page count: 11