M3T-LM: A multi-modal multi-task learning model for jointly predicting patient length of stay and mortality

Cited: 0
Authors
Chen, Junde [1 ]
Li, Qing [2 ]
Liu, Feng [3 ]
Wen, Yuxin [1 ]
Affiliations
[1] Dale E. and Sarah Ann Fowler School of Engineering, Chapman University, Orange, CA 92866
[2] Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA 50011
[3] School of Systems and Enterprises, Stevens Institute of Technology, Hoboken, NJ 07030
Funding
U.S. National Science Foundation
Keywords
Data-fusion model; Deep learning; Length of stay prediction; Multi-task learning
DOI
10.1016/j.compbiomed.2024.109237
Abstract
Ensuring accurate predictions of inpatient length of stay (LoS) and mortality is essential for enhancing hospital service efficiency, particularly in light of the constraints posed by limited healthcare resources. Integrative analysis of heterogeneous clinical record data from different sources holds great promise for improving the prognosis and diagnosis of LoS and mortality. Most existing studies, however, focus solely on a single data modality or rely on single-task learning, i.e., training LoS and mortality models separately. This limits the utilization of available multi-modal data and prevents the sharing of feature representations that could capture correlations between the tasks, ultimately hindering model performance. To address this challenge, this study proposes a novel Multi-Modal Multi-Task learning model, termed M3T-LM, that integrates clinical records to predict inpatients' LoS and mortality simultaneously. The M3T-LM framework incorporates multiple data modalities by constructing sub-models tailored to each modality. Specifically, a novel attention-embedded one-dimensional (1D) convolutional neural network (CNN) is designed to handle numerical data. Clinical notes are converted into sequence data, and two long short-term memory (LSTM) networks are exploited to model the textual sequences. A two-dimensional (2D) CNN architecture, denoted CRXMDL, is designed to extract high-level features from chest X-ray (CXR) images. The sub-models are then integrated to form M3T-LM and capture the correlations between the LoS and mortality prediction tasks. The effectiveness of the proposed method is validated on the MIMIC-IV dataset, where it attained a test MAE of 5.54 for LoS prediction and a test F1 of 0.876 for mortality prediction. The experimental results demonstrate that our approach outperforms state-of-the-art (SOTA) methods in tackling mixed regression and classification tasks.
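The fusion idea described in the abstract can be sketched as follows. This is a minimal illustrative forward pass, not the authors' implementation: each modality sub-model (attention 1D-CNN, LSTMs, CRXMDL 2D-CNN) is stood in for by a random feature vector, the vectors are concatenated, and a shared representation feeds two task heads — a regression head for LoS and a sigmoid head for mortality. All layer sizes and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """Single fully connected layer."""
    return x @ w + b

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-ins for the per-modality feature vectors produced by the
# three sub-models; the dimensions here are illustrative only.
f_num = rng.normal(size=(1, 32))   # numerical-record features (1D CNN)
f_txt = rng.normal(size=(1, 64))   # clinical-note features (LSTMs)
f_img = rng.normal(size=(1, 128))  # chest X-ray features (2D CNN)

# Fuse modalities by concatenation, then share a trunk across tasks
# (hard parameter sharing), so both heads see the same representation.
fused = np.concatenate([f_num, f_txt, f_img], axis=1)   # shape (1, 224)
w_shared, b_shared = rng.normal(size=(224, 16)), np.zeros(16)
shared = relu(dense(fused, w_shared, b_shared))

# Task-specific heads: regression for LoS, binary classification for mortality.
w_los, b_los = rng.normal(size=(16, 1)), np.zeros(1)
los_pred = dense(shared, w_los, b_los)                  # LoS estimate (days)

w_mort, b_mort = rng.normal(size=(16, 1)), np.zeros(1)
mort_prob = sigmoid(dense(shared, w_mort, b_mort))      # mortality risk in (0, 1)
```

In a trained model the two heads would be optimized jointly, e.g. with a weighted sum of an MAE loss on `los_pred` and a cross-entropy loss on `mort_prob`, which is what lets the shared trunk capture correlations between the two tasks.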
© 2024 Elsevier Ltd