Scalable and accurate deep learning with electronic health records

Cited by: 1320
Authors
Rajkomar, Alvin [1 ,2 ]
Oren, Eyal [1 ]
Chen, Kai [1 ]
Dai, Andrew M. [1 ]
Hajaj, Nissan [1 ]
Hardt, Michaela [1 ]
Liu, Peter J. [1 ]
Liu, Xiaobing [1 ]
Marcus, Jake [1 ]
Sun, Mimi [1 ]
Sundberg, Patrik [1 ]
Yee, Hector [1 ]
Zhang, Kun [1 ]
Zhang, Yi [1 ]
Flores, Gerardo [1 ]
Duggan, Gavin E. [1 ]
Irvine, Jamie [1 ]
Quoc Le [1 ]
Litsch, Kurt [1 ]
Mossin, Alexander [1 ]
Tansuwan, Justin [1 ]
Wang, De [1 ]
Wexler, James [1 ]
Wilson, Jimbo [1 ]
Ludwig, Dana [2 ]
Volchenboum, Samuel L. [3 ]
Chou, Katherine [1 ]
Pearson, Michael [1 ]
Madabushi, Srinivasan [1 ]
Shah, Nigam H. [4 ]
Butte, Atul J. [2 ]
Howell, Michael D. [1 ]
Cui, Claire [1 ]
Corrado, Greg S. [1 ]
Dean, Jeffrey [1 ]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Univ Calif San Francisco, San Francisco, CA 94143 USA
[3] Univ Chicago Med, Chicago, IL USA
[4] Stanford Univ, Stanford, CA 94305 USA
Source
NPJ DIGITAL MEDICINE | 2018 / Vol. 1
Keywords
RISK PREDICTION MODELS; EARLY WARNING SCORE; BIG DATA; HOSPITAL READMISSION; MEDICAL-RECORDS; VALIDATION; CARE; INPATIENT; ANALYTICS; PATIENT;
DOI
10.1038/s41746-018-0029-1
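Chinese Library Classification
R19 [Health care organizations and services (health services administration)];
Subject classification code
Abstract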
Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient's chart.
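The abstract reports per-task AUROC and a frequency-weighted AUROC over discharge diagnoses without defining the weighting. The following is a minimal sketch, not the authors' evaluation code: it computes AUROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one, and assumes that "frequency-weighted" means each diagnosis is weighted by its number of positive cases. The function names and the toy data are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def frequency_weighted_auroc(
    per_label: Dict[str, Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Average per-label AUROC, weighted by positive count (an assumed weighting)."""
    weighted_sum = 0.0
    total_weight = 0
    for labels, scores in per_label.values():
        weight = sum(labels)  # number of positive cases for this label
        weighted_sum += weight * auroc(labels, scores)
        total_weight += weight
    return weighted_sum / total_weight


# Toy data only; these are not values from the paper.
toy = {
    "diagnosis_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "diagnosis_b": ([0, 1, 0, 0], [0.2, 0.9, 0.1, 0.3]),
}
print(auroc(*toy["diagnosis_a"]))
print(frequency_weighted_auroc(toy))
```

The pairwise loop is quadratic in the number of cases and is meant only to make the definition explicit; a rank-based implementation would be preferable at the scale described in the abstract.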
Pages: 10
Related papers
50 in total
  • [21] Machine learning approaches for electronic health records phenotyping: a methodical review
    Yang, Siyue
    Varghese, Paul
    Stephenson, Ellen
    Tu, Karen
    Gronsbell, Jessica
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2023, 30 (02) : 367 - 381
  • [22] Machine learning model to predict mental health crises from electronic health records
    Garriga, Roger
    Mas, Javier
    Abraha, Semhar
    Nolan, Jon
    Harrison, Oliver
    Tadros, George
    Matic, Aleksandar
    NATURE MEDICINE, 2022, 28 (06) : 1240 - +
  • [23] Textual analysis and visualization of research trends in data mining for electronic health records
    Chen, Jingfeng
    Wei, Wei
    Guo, Chonghui
    Tang, Lin
    Sun, Leilei
    HEALTH POLICY AND TECHNOLOGY, 2017, 6 (04) : 389 - 400
  • [24] A rapid review of gender, sex, and sexual orientation documentation in electronic health records
    Lau, Francis
    Antonio, Marcy
    Davison, Kelly
    Queen, Roz
    Devor, Aaron
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2020, 27 (11) : 1774 - 1783
  • [25] Electronic Health Records in Hospitals
    Lipschutz, Josh H.
    NEW ENGLAND JOURNAL OF MEDICINE, 2009, 361 (04) : 421 - 421
  • [26] Predicting age by mining electronic medical records with deep learning characterizes differences between chronological and physiological age
    Wang, Zichen
    Li, Li
    Glicksberg, Benjamin S.
    Israel, Ariel
    Dudley, Joel T.
    Ma'ayan, Avi
    JOURNAL OF BIOMEDICAL INFORMATICS, 2017, 76 : 59 - 68
  • [27] The challenges in making electronic health records accessible to patients
    Beard, Leslie
    Schein, Rebecca
    Morra, Dante
    Wilson, Kumanan
    Keelan, Jennifer
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2012, 19 (01) : 116 - 120
  • [28] Training providers: beyond the basics of electronic health records
    Bredfeldt, Christine E.
    Awad, Elias Bruce
    Joseph, Kenneth
    Snyder, Mark H.
    BMC HEALTH SERVICES RESEARCH, 2013, 13
  • [29] Ethical questions must be considered for electronic health records
    Spriggs, Merle
    Arnold, Michael V.
    Pearce, Christopher M.
    Fry, Craig
    JOURNAL OF MEDICAL ETHICS, 2012, 38 (09) : 535 - 539
  • [30] Quantification of abdominal fat from computed tomography using deep learning and its association with electronic health records in an academic biobank
    MacLean, Matthew T.
    Jehangir, Qasim
    Vujkovic, Marijana
    Ko, Yi-An
    Litt, Harold
    Borthakur, Arijitt
    Sagreiya, Hersh
    Rosen, Mark
    Mankoff, David A.
    Schnall, Mitchell D.
    Shou, Haochang
    Chirinos, Julio
    Damrauer, Scott M.
    Torigian, Drew A.
    Carr, Rotonya
    Rader, Daniel J.
    Witschey, Walter R.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28 (06) : 1178 - 1187