Clinical notes as prognostic markers of mortality associated with diabetes mellitus following critical care: A retrospective cohort analysis using machine learning and unstructured big data

Citations: 19
Authors
De Silva, Kushan [1]
Mathews, Noel [1]
Teede, Helena [1]
Forbes, Andrew [2]
Jonsson, Daniel [3, 4]
Demmer, Ryan T. [5, 6]
Enticott, Joanne [1]
Affiliations
[1] Monash Univ, Fac Med Nursing & Hlth Sci, Sch Publ Hlth & Prevent Med, Monash Ctr Hlth Res & Implementat, Locked Bag 29,Level 1,43-51 Kanooka Grove, Clayton, Vic 3168, Australia
[2] Monash Univ, Fac Med Nursing & Hlth Sci, Sch Publ Hlth & Prevent Med, Biostat Unit,Div Res Methodol, Melbourne, Vic 3004, Australia
[3] Malmo Univ, Fac Odontol, Dept Periodontol, S-21119 Malmo, Sweden
[4] Swedish Dent Serv Skane, S-22647 Lund, Sweden
[5] Univ Minnesota, Sch Publ Hlth, Div Epidemiol & Community Hlth, Minneapolis, MN USA
[6] Columbia Univ, Mailman Sch Publ Hlth, New York, NY USA
Keywords
Critical care; Diabetes; Electronic health records; LASSO; Machine learning; Mortality; Natural language processing; Prognosis; Text mining; ALL-CAUSE MORTALITY; PREDICTION; SELECTION; MODELS; SYSTEM; COMPLICATIONS; RISK;
DOI
10.1016/j.compbiomed.2021.104305
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: Clinical notes are ubiquitous resources offering potential value in optimizing critical care via data mining technologies.
Objective: To determine the predictive value of clinical notes as prognostic markers of 1-year all-cause mortality among people with diabetes following critical care.
Materials and methods: Mortality of patients with diabetes was predicted using three cohorts of clinical text in a critical care database, written by physicians (n = 45253), nurses (n = 159027), and both (n = 204280). Natural language processing was used to pre-process text documents, and LASSO-regularized logistic regression models were trained and tested. Confusion matrix metrics were calculated for each model, and AUROC estimates were compared between models. All predictive words and their corresponding coefficients were extracted. The outcome probability associated with each text document was estimated.
Results: Models built on the clinical text of physicians, nurses, and the combined cohort predicted mortality with AUROCs of 0.996, 0.893, and 0.922, respectively. The predictive performance of the models differed significantly from one another, whereas inter-rater reliability ranged from substantial to almost perfect across them. The numbers of predictive words with non-zero coefficients were 3994, 8159, and 10579 in the models of physicians, nurses, and the combined cohort, respectively. Physicians' and nursing notes, both individually and when combined, strongly predicted 1-year all-cause mortality among people with diabetes following critical care.
Conclusion: Clinical notes of physicians and nurses are strong and novel prognostic markers of diabetes-associated mortality in critical care, offering potentially generalizable and scalable applications. Clinical text-derived personalized risk estimates of prognostic outcomes such as mortality could be used to optimize patient care.
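The workflow named in the abstract (pre-process text, fit a LASSO-regularized logistic regression, extract the words with non-zero coefficients, estimate an outcome probability per document) can be illustrated with a minimal sketch. This is not the authors' code: the six notes and their labels are invented toy data, the tokenizer is a stand-in for fuller NLP pre-processing, and the proximal-gradient fitting routine with its `lam`, `lr`, and `epochs` values is an assumption chosen only to keep the example dependency-free.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep alphabetic tokens (stand-in for real NLP pre-processing).
    return re.findall(r"[a-z]+", text.lower())

def vectorize(doc, vocab_index):
    # Bag-of-words count vector over a fixed vocabulary.
    vec = [0.0] * len(vocab_index)
    for tok, n in Counter(tokenize(doc)).items():
        if tok in vocab_index:
            vec[vocab_index[tok]] = float(n)
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_lasso_logistic(X, y, lam=0.05, lr=0.1, epochs=500):
    # Proximal gradient descent on mean log-loss + lam * ||w||_1.
    # The intercept b is not penalized. Hyperparameters are illustrative.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j, xj in enumerate(xi):
                if xj:
                    grad_w[j] += err * xj / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def predict_proba(doc, w, b, vocab_index):
    # Estimated outcome probability for a single text document.
    x = vectorize(doc, vocab_index)
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical toy notes (1 = died within follow-up, 0 = survived).
notes = [
    ("patient intubated with septic shock and renal failure", 1),
    ("comfort care initiated after multi organ failure", 1),
    ("worsening shock on pressors family meeting held", 1),
    ("patient stable tolerating diet ambulating well", 0),
    ("glucose controlled stable plan discharge home", 0),
    ("alert oriented ambulating discharge planned", 0),
]

vocab = sorted({tok for text, _ in notes for tok in tokenize(text)})
vocab_index = {tok: i for i, tok in enumerate(vocab)}
X = [vectorize(text, vocab_index) for text, _ in notes]
y = [label for _, label in notes]

w, b = fit_lasso_logistic(X, y)

# Predictive words are those whose coefficients survive the L1 shrinkage.
selected = {tok: w[i] for tok, i in vocab_index.items() if w[i] != 0.0}

p_high = predict_proba("septic shock with organ failure", w, b, vocab_index)
p_low = predict_proba("stable ambulating discharge home", w, b, vocab_index)
print(sorted(selected.items(), key=lambda kv: kv[1]))
print(round(p_high, 3), round(p_low, 3))
```

The L1 penalty is what makes the list of predictive words meaningful: coefficients of uninformative words are driven to exactly zero, so the surviving vocabulary can be read off directly, in the same spirit as the counts of non-zero coefficients reported in the Results. On a real corpus a tested library implementation and a cross-validated penalty would be the usual choice.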
Pages: 11