Development and validation of interpretable machine learning models to predict glomerular filtration rate in chronic kidney disease Colombian patients

Cited by: 0
Authors
Rojas, Luis H. [1]
Pereira-Morales, Angela J. [1]
Amador, William [1]
Montenegro, Albert [1]
Buelvas, Walberto [2]
de la Espriella, Victor [2]
Affiliations
[1] Sci Life S4L SAS, Calle 11c 73-52, Bogota 110821, Colombia
[2] Medisinu IPS, Monteria, Colombia
Keywords
Machine learning; chronic kidney disease; extreme gradient boosting; risk prediction; VARIABILITY; BURDEN; RISK;
DOI
10.1177/00045632241285528
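Chinese Library Classification number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments];
Discipline classification code
1001;
Abstract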
Background: Machine learning (ML) predictive models have shown that they can improve risk prediction and assist medical decision-making. Nevertheless, accurate systems for the early identification of future rapid chronic kidney disease (CKD) progressors are lacking in Colombia and in South America more broadly. Objective: To develop a series of interpretable machine learning models that predict glomerular filtration rate (GFR) at 6, 9, and 12 months. Study Design and Setting: Over 29,000 patients with CKD stages 1 to 3b (estimated GFR, <60 mL/min/1.73 m²) and an average of 3 years of follow-up data were included. We used extreme gradient boosting (XGBoost) to build three models that predict the next eGFR value. The models were internally and externally validated. In addition, we used SHapley Additive exPlanation (SHAP) values to provide interpretable global and local predictions. Results: All models performed well in development and external validation. The 6-month XGBoost model performed best in both internal validation (average MAE = 6.07; RMSE = 78.87) and external validation (average MAE = 6.45; RMSE = 18.94). The three most influential features pushing the predicted eGFR toward lower values were the interpolated values for eGFR and creatinine, and eGFR at baseline. Conclusion: We developed and validated machine learning models that predict the next eGFR value at different intervals. Furthermore, we sought to address the need for prediction explanation by offering transparent predictions.
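The abstract reports model error as MAE and RMSE. As a minimal sketch of how these two metrics are conventionally computed, the example below uses small hypothetical eGFR values; the data, the function names `mae` and `rmse`, and the variable names are illustrative assumptions and are not taken from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the mean squared residual."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical eGFR values in mL/min/1.73 m^2 (NOT data from the study).
observed = [55.0, 48.0, 62.0, 39.0]
predicted = [52.0, 50.0, 57.0, 45.0]

mae_value = mae(observed, predicted)
rmse_value = rmse(observed, predicted)

print(f"MAE  = {mae_value:.2f}")
print(f"RMSE = {rmse_value:.2f}")
```

Both metrics are expressed in the units of the predicted quantity. RMSE weights large residuals more heavily than MAE, so for the same set of predictions it is never smaller than MAE.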
Pages: 57-66
Number of pages: 10