Multimodal fine-tuning of clinical language models for predicting COVID-19 outcomes

Cited by: 3

Authors:

Henriksson, Aron [1]

Pawar, Yash [1]

Hedberg, Pontus [2,3]

Naucler, Pontus [2,3]

Affiliations:

[1] Stockholm Univ, Dept Comp & Syst Sci DSV, Kista, Sweden

[2] Karolinska Inst, Dept Med Solna MedS, Div Infect Dis, Stockholm, Sweden

[3] Karolinska Univ Hosp, Dept Infect Dis, Stockholm, Sweden

Source:

ARTIFICIAL INTELLIGENCE IN MEDICINE | 2023 / Volume 146

Funding:

Swedish Research Council;

Keywords:

Natural language processing; Machine learning; Language models; Clinical BERT; Multimodal learning; Electronic health records; Outcome prediction; COVID-19;

DOI:

10.1016/j.artmed.2023.102695

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Clinical prediction models tend only to incorporate structured healthcare data, ignoring information recorded in other data modalities, including free-text clinical notes. Here, we demonstrate how multimodal models that effectively leverage both structured and unstructured data can be developed for predicting COVID-19 outcomes. The models are trained end-to-end using a technique we refer to as multimodal fine-tuning, whereby a pre-trained language model is updated based on both structured and unstructured data. The multimodal models are trained and evaluated using a multicenter cohort of COVID-19 patients encompassing all encounters at the emergency department of six hospitals. Experimental results show that multimodal models, leveraging the notion of multimodal fine-tuning and trained to predict (i) 30-day mortality, (ii) safe discharge and (iii) readmission, outperform unimodal models trained using only structured or unstructured healthcare data on all three outcomes. Sensitivity analyses are performed to better understand how well the multimodal models perform on different patient groups, while an ablation study is conducted to investigate the impact of different types of clinical notes on model performance. We argue that multimodal models that make effective use of routinely collected healthcare data to predict COVID-19 outcomes may facilitate patient management and contribute to the effective use of limited healthcare resources.


Pages: 11

