Scalable and accurate deep learning with electronic health records

Cited by: 1335
Authors
Rajkomar, Alvin [1, 2]
Oren, Eyal [1]
Chen, Kai [1]
Dai, Andrew M. [1]
Hajaj, Nissan [1]
Hardt, Michaela [1]
Liu, Peter J. [1]
Liu, Xiaobing [1]
Marcus, Jake [1]
Sun, Mimi [1]
Sundberg, Patrik [1]
Yee, Hector [1]
Zhang, Kun [1]
Zhang, Yi [1]
Flores, Gerardo [1]
Duggan, Gavin E. [1]
Irvine, Jamie [1]
Le, Quoc [1]
Litsch, Kurt [1]
Mossin, Alexander [1]
Tansuwan, Justin [1]
Wang, De [1]
Wexler, James [1]
Wilson, Jimbo [1]
Ludwig, Dana [2]
Volchenboum, Samuel L. [3]
Chou, Katherine [1]
Pearson, Michael [1]
Madabushi, Srinivasan [1]
Shah, Nigam H. [4]
Butte, Atul J. [2]
Howell, Michael D. [1]
Cui, Claire [1]
Corrado, Greg S. [1]
Dean, Jeffrey [1]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Univ Calif San Francisco, San Francisco, CA 94143 USA
[3] Univ Chicago Med, Chicago, IL USA
[4] Stanford Univ, Stanford, CA 94305 USA
Source
NPJ DIGITAL MEDICINE | 2018 / Vol. 1
Keywords
RISK PREDICTION MODELS; EARLY WARNING SCORE; BIG DATA; HOSPITAL READMISSION; MEDICAL-RECORDS; VALIDATION; CARE; INPATIENT; ANALYTICS; PATIENT;
DOI
10.1038/s41746-018-0029-1
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)];
Abstract
Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient's chart.
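The abstract reports results as AUROC and, for discharge diagnoses, as a frequency-weighted AUROC. The following is a minimal sketch of how such metrics can be computed; it is not the authors' code. The function names `auroc` and `frequency_weighted_auroc` are hypothetical, and weighting each diagnosis label by its number of positive cases is an assumption about what "frequency-weighted" means here.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def frequency_weighted_auroc(
    per_label: List[Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Average per-label AUROC, weighting each label by its count of
    positive cases (assumed definition, see lead-in)."""
    weighted_sum = 0.0
    total_weight = 0
    for labels, scores in per_label:
        weight = sum(labels)
        weighted_sum += weight * auroc(labels, scores)
        total_weight += weight
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of cases, which keeps the definition readable; for data at the scale described in the abstract, a rank-based implementation would be the practical choice.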
Pages: 10