Scalable and accurate deep learning with electronic health records

Cited by: 1320

Authors:

Rajkomar, Alvin [1,2]

Oren, Eyal [1]

Chen, Kai [1]

Dai, Andrew M. [1]

Hajaj, Nissan [1]

Hardt, Michaela [1]

Liu, Peter J. [1]

Liu, Xiaobing [1]

Marcus, Jake [1]

Sun, Mimi [1]

Sundberg, Patrik [1]

Yee, Hector [1]

Zhang, Kun [1]

Zhang, Yi [1]

Flores, Gerardo [1]

Duggan, Gavin E. [1]

Irvine, Jamie [1]

Le, Quoc [1]

Litsch, Kurt [1]

Mossin, Alexander [1]

Tansuwan, Justin [1]

Wang, De [1]

Wexler, James [1]

Wilson, Jimbo [1]

Ludwig, Dana [2]

Volchenboum, Samuel L. [3]

Chou, Katherine [1]

Pearson, Michael [1]

Madabushi, Srinivasan [1]

Shah, Nigam H. [4]

Butte, Atul J. [2]

Howell, Michael D. [1]

Cui, Claire [1]

Corrado, Greg S. [1]

Dean, Jeffrey [1]

Affiliations:

[1] Google Inc, Mountain View, CA 94043 USA

[2] Univ Calif San Francisco, San Francisco, CA 94143 USA

[3] Univ Chicago Med, Chicago, IL USA

[4] Stanford Univ, Stanford, CA 94305 USA

Source:

NPJ DIGITAL MEDICINE | 2018 / Volume 1

Keywords:

RISK PREDICTION MODELS; EARLY WARNING SCORE; BIG DATA; HOSPITAL READMISSION; MEDICAL-RECORDS; VALIDATION; CARE; INPATIENT; ANALYTICS; PATIENT;

DOI:

10.1038/s41746-018-0029-1

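Chinese Library Classification number:

R19 [Health care organization and services (health services administration)];

Subject classification number: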

Abstract:

Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient's chart.
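The abstract reports every result as an AUROC. As a reading aid, here is a minimal sketch of how that metric can be computed from binary labels and model scores: it is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name `auroc` and the toy data are illustrative assumptions; this is not the authors' evaluation code, and the numbers below are made up rather than taken from the paper.

```python
def auroc(labels, scores):
    """Return the AUROC for binary labels (0/1) and real-valued scores.

    Computed as the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event occurred, 0 = it did not.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; at the scale described in the abstract a rank-based implementation would normally be used instead.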


Pages: 10

