Validation of risk prediction models applied to longitudinal electronic health record data for the prediction of major cardiovascular events in the presence of data shifts

Cited by: 8
Authors
Li, Yikuan [1, 2]
Salimi-Khorshidi, Gholamreza [1, 2]
Rao, Shishir [1, 2]
Canoy, Dexter [1, 2, 3]
Hassaine, Abdelaali [1, 2]
Lukasiewicz, Thomas [4]
Rahimi, Kazem [1, 2, 3]
Mamouei, Mohammad [1, 2]
Affiliations
[1] Univ Oxford, Oxford Martin Sch, Deep Med, Hayes House,75 George St, Oxford OX1 2BQ, England
[2] Univ Oxford, Nuffield Dept Womens & Reprod Hlth, Med Sci Div, Oxford, England
[3] Oxford Univ Hosp NHS Fdn Trust, NIHR Oxford Biomed Res Ctr, Oxford, England
[4] Univ Oxford, Dept Comp Sci, Oxford, England
Source
Funding
UK Research and Innovation
Keywords
Cardiovascular disease risk; Heart Failure; Stroke; Coronary heart disease; Predictive modelling; Data shifts; PROFILE;
DOI
10.1093/ehjdh/ztac061
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Aims: Deep learning has dominated predictive modelling across different fields, but in medicine it has been met with mixed reception. In clinical practice, simple statistical models and risk scores continue to inform cardiovascular disease risk predictions. This is due in part to the knowledge gap about how deep learning models perform in practice when they are subject to dynamic data shifts, a key criterion that common internal validation procedures do not address. We evaluated the performance of a novel deep learning model, BEHRT, under data shifts and compared it with several ML-based and established risk models.
Methods and results: Using linked electronic health records of 1.1 million patients across England aged at least 35 years between 1985 and 2015, we replicated three established statistical models for predicting 5-year risk of incident heart failure, stroke, and coronary heart disease. The results were compared with a widely accepted machine learning model (random forests) and a novel deep learning model (BEHRT). In addition to internal validation, we investigated how data shifts affect model discrimination and calibration. To this end, we tested the models on cohorts from (i) distinct geographical regions and (ii) different periods. Using internal validation, the deep learning models substantially outperformed the best statistical models by 6%, 8%, and 11% in heart failure, stroke, and coronary heart disease, respectively, in terms of the area under the receiver operating characteristic curve.
Conclusion: The performance of all models declined as a result of data shifts; despite this, the deep learning models maintained the best performance in all risk prediction tasks. Updating the model with the latest information can improve discrimination, but if the prior distribution changes, the model may remain miscalibrated.
Graphical Abstract: Design and main results of the model evaluation in the presence of data shift.
EHR, electronic health records; HES, hospital episode statistics; HF, heart failure; CHD, coronary heart disease; CPH, Cox proportional hazard; ML, machine learning; DL, deep learning; RF, random forest.
Pages: 535-547
Number of pages: 13
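Illustrative note: the abstract distinguishes discrimination (area under the ROC curve) from calibration and states that a change in the prior distribution can leave a model miscalibrated. The sketch below is one way to see that distinction on purely synthetic data. It is not the authors' code, it does not use BEHRT or the study cohorts, and every name and number in it (`simulate_cohort`, `auroc`, `observed_expected_ratio`, the intercepts, the cohort size) is an assumption made for illustration. A fixed logistic risk model is scored on a "source" cohort and on a "target" cohort whose baseline risk is higher.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_cohort(n, intercept, slope, rng):
    """Synthetic cohort: one standard-normal risk factor and a binary outcome."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = sigmoid(intercept + slope * x)
        xs.append(x)
        ys.append(1 if rng.random() < p else 0)
    return xs, ys


def auroc(labels, scores):
    """Rank-based (Mann-Whitney) AUROC; tied scores receive average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def observed_expected_ratio(labels, predictions):
    """Calibration-in-the-large: observed events divided by predicted events."""
    return sum(labels) / sum(predictions)


rng = random.Random(0)  # fixed seed so the run is repeatable
SLOPE = 1.0
SOURCE_INTERCEPT = -2.5  # assumed baseline log-odds in the development cohort
TARGET_INTERCEPT = -1.5  # assumed higher baseline log-odds after the shift

x_src, y_src = simulate_cohort(5000, SOURCE_INTERCEPT, SLOPE, rng)
x_tgt, y_tgt = simulate_cohort(5000, TARGET_INTERCEPT, SLOPE, rng)

# The "model" keeps the source intercept in both cohorts (no recalibration).
pred_src = [sigmoid(SOURCE_INTERCEPT + SLOPE * x) for x in x_src]
pred_tgt = [sigmoid(SOURCE_INTERCEPT + SLOPE * x) for x in x_tgt]

auc_src, auc_tgt = auroc(y_src, pred_src), auroc(y_tgt, pred_tgt)
oe_src = observed_expected_ratio(y_src, pred_src)
oe_tgt = observed_expected_ratio(y_tgt, pred_tgt)

print(f"source: AUROC={auc_src:.3f}  observed/expected={oe_src:.2f}")
print(f"target: AUROC={auc_tgt:.3f}  observed/expected={oe_tgt:.2f}")
```

In this toy setup the ranking of patients is unchanged by the shift, so the AUROC stays close between cohorts, while the observed/expected ratio moves well away from 1 in the target cohort because the model still assumes the old baseline risk. Real data shifts, such as those studied in the article, can affect both properties at once.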