Prediction of 3-year risk of diabetic kidney disease using machine learning based on electronic medical records

Cited by: 60
Authors
Dong, Zheyi [1 ]
Wang, Qian [1 ]
Ke, Yujing [1 ]
Zhang, Weiguang [1 ]
Hong, Quan [1 ]
Liu, Chao [1 ]
Liu, Xiaomin [1 ]
Yang, Jian [1 ]
Xi, Yue [1 ]
Shi, Jinlong [2 ]
Zhang, Li [1 ]
Zheng, Ying [1 ]
Lv, Qiang [1 ]
Wang, Yong [1 ]
Wu, Jie [1 ]
Sun, Xuefeng [1 ]
Cai, Guangyan [1 ]
Qiao, Shen [2 ]
Yin, Chengliang [2 ]
Su, Shibin [2 ]
Chen, Xiangmei [1 ]
Affiliations
[1] Chinese Peoples Liberat Army Gen Hosp, Med Ctr 1, Dept Nephrol, Natl Clin Res Ctr Kidney Dis, Beijing, Nephrol Inst, Chinese Peoples Liberat Army, State K, 28 Fuxing Rd, Beijing 100853, Peoples R China
[2] Army Gen Hosp, Natl Engn Lab Med Big Data Applicat Technol, Med Innovat Res Div Chinese Peoples Liberat, Med Big Data Res Ctr, 28 Fuxing Rd, Beijing 100853, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Type 2 diabetes; Diabetic kidney disease; Electronic medical records; Machine learning; Light gradient boosting machine; Risk assessment; GLOMERULAR-FILTRATION-RATE; PROGRESSION; NEPHROPATHY; VALIDATION; DECLINE; ALBUMIN; PEOPLE; LIPIDS; GFR;
DOI
10.1186/s12967-022-03339-1
Chinese Library Classification (CLC) number
R-3 [Medical research methods]; R3 [Basic medicine];
Discipline classification code
1001;
Abstract
Background: Established prediction models of diabetic kidney disease (DKD) are limited to the analysis of clinical research data or general population data and do not consider hospital visits. This study aimed to construct a 3-year DKD risk prediction model for patients with type 2 diabetes mellitus (T2DM) using machine learning, based on electronic medical records (EMR). Methods: Data came from 816 patients (585 males) with T2DM and 3 years of follow-up at the PLA General Hospital. Forty-six medical characteristics that are readily available from EMR were used to develop prediction models based on seven machine learning algorithms (light gradient boosting machine [LightGBM], eXtreme gradient boosting, adaptive boosting, artificial neural network, decision tree, support vector machine, and logistic regression). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC). Shapley additive explanation (SHAP) was used to interpret the results of the best-performing model. Results: The LightGBM model had the highest AUC (0.815, 95% CI 0.747-0.882). Recursive feature elimination with random forest and a SHAP plot based on LightGBM showed that older patients with T2DM who had high homocysteine (Hcy), poor glycemic control, low serum albumin (ALB), low estimated glomerular filtration rate (eGFR), and high bicarbonate had an increased risk of developing DKD over the next 3 years. Conclusions: This study constructed a 3-year DKD risk prediction model for patients with T2DM and normo-albuminuria using machine learning and EMR. The LightGBM model is a tool with the potential to facilitate population management strategies for T2DM care in the EMR era.
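The abstract reports model performance as an AUC with a 95% confidence interval. As a rough illustration of what that metric measures, the sketch below computes AUC as the fraction of positive/negative pairs ranked correctly and estimates an interval with a percentile bootstrap. This is not the authors' code: the record does not say how their interval was obtained, so the bootstrap is an assumption, and the function names (`auc`, `bootstrap_auc_ci`) and the toy labels and scores are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains one class only; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = developed DKD within 3 years, 0 = did not.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90, 0.20, 0.60]

print(auc(toy_labels, toy_scores))  # 0.75 (12 of 16 pairs ranked correctly)
print(bootstrap_auc_ci(toy_labels, toy_scores))
```

In practice the scores would be a fitted model's predicted probabilities on held-out patients, and a library routine such as scikit-learn's `roc_auc_score` would replace the quadratic pairwise loop.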
Pages: 10