Health system-scale language models are all-purpose prediction engines

Cited by: 203

Authors:

Jiang, Lavender Yao [1,2]

Liu, Xujin Chris [1,3]

Nejatian, Nima Pour [4]

Nasir-Moin, Mustafa [1]

Wang, Duo [5]

Abidin, Anas [4]

Eaton, Kevin [6]

Riina, Howard Antony [1]

Laufer, Ilya [1]

Punjabi, Paawan [6]

Miceli, Madeline [6]

Kim, Nora C. [1]

Orillac, Cordelia [1]

Schnurman, Zane [1]

Livia, Christopher [1]

Weiss, Hannah [1]

Kurland, David [1]

Neifert, Sean [1]

Dastagirzada, Yosef [1]

Kondziolka, Douglas [1]

Cheung, Alexander T. M. [1]

Yang, Grace [1,2]

Cao, Ming [1,2]

Flores, Mona [4]

Costa, Anthony B. [4]

Aphinyanaphongs, Yindalon [5,7]

Cho, Kyunghyun [2,8,9,10]

Oermann, Eric Karl [1,2,11]

Affiliations:

[1] NYU Langone Hlth, Dept Neurosurg, New York, NY 10016 USA

[2] NYU, Ctr Data Sci, New York, NY 10012 USA

[3] Tandon Sch Engn, Elect & Comp Engn, New York, NY USA

[4] NVIDIA, Santa Clara, CA USA

[5] NYU Langone Hlth, Predict Analyt Unit, New York, NY USA

[6] NYU Langone Hlth, Dept Internal Med, New York, NY USA

[7] NYU Langone Hlth, Dept Populat Hlth, New York, NY USA

[8] Genentech Inc, Prescient Design, New York, NY USA

[9] NYU, Courant Inst Math Sci, New York, NY USA

[10] Canadian Inst Adv Res, Toronto, ON, Canada

[11] NYU Langone Hlth, Dept Radiol, New York, NY 10016 USA

Source:

NATURE | 2023 / Vol. 619 / Issue 7969

Keywords:

DOI:

10.1038/s41586-023-06160-y

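Chinese Library Classification:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification codes: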

07 ; 0710 ; 09 ;

Abstract:

Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data-based clinical predictive models have limited use in everyday practice owing to complexity in data processing, as well as model development and deployment (1-3). Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing (4,5) to train a large language model for medical language (NYUTron) and subsequently fine-tune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7-94.9%, with an improvement of 5.36-14.7% in the AUC compared with traditional models. We additionally demonstrate the benefits of pretraining with clinical text, the potential for increasing generalizability to different sites through fine-tuning and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.

References

Pages: 357 / +

Page count: 25

46 references in total

[1] [Anonymous], OPEN MED

[2] Ayaz, Muhammad; Pasha, Muhammad F.; Alzahrani, Mohammed Y.; Budiarto, Rahmat; Stiawan, Deris. The Fast Health Interoperability Resources (FHIR) Standard: Systematic Literature Review of Implementations, Applications, Challenges and Opportunities. JMIR MEDICAL INFORMATICS, 2021, 9 (07)

[3] Bolton E., 2022, PUBMEDGPT 2 7B

[4] Brown TB, 2020, ADV NEUR IN, V33

[5] Caetano Nuno, 2014, 16th International Conference on Enterprise Information Systems (ICEIS 2014). Proceedings, P407

[6] Center for Disease Control, 2022, WHAT IS C DIFF

[7] Charlson comorbidity index (CCI), 2022, MD CALC

[8] CHARLSON, ME; POMPEI, P; ALES, KL; MACKENZIE, CR. A NEW METHOD OF CLASSIFYING PROGNOSTIC CO-MORBIDITY IN LONGITUDINAL-STUDIES - DEVELOPMENT AND VALIDATION. JOURNAL OF CHRONIC DISEASES, 1987, 40 (05): 373-383

[9] Chen, Tianqi; Guestrin, Carlos. XGBoost: A Scalable Tree Boosting System. KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016: 785-794

[10] Child C G, 1964, Major Probl Clin Surg, V1, P1
