EHR-based prediction modelling meets multimodal deep learning: A systematic review of structured and textual data fusion methods

Cited by: 1
Authors
Teles, Ariel Soares [1, 2]
de Moura, Ivan Rodrigues [3 ]
Silva, Francisco [4 ]
Roberts, Angus [1 ]
Stahl, Daniel
Affiliations
[1] Kings Coll London, Inst Psychiat Psychol & Neurosci, Dept Biostat & Hlth Informat, London, England
[2] Fed Inst Maranhao, Sao Luis, Maranhao, Brazil
[3] Fed Inst Piaui, Teresina, Piaui, Brazil
[4] Univ Fed Maranhao, Maranhao, Maranhao, Brazil
Keywords
Multimodal; Deep learning; Electronic Health Records; Predictive modelling; Data fusion; Natural Language Processing; Medical AI; ELECTRONIC HEALTH RECORDS; ARTIFICIAL-INTELLIGENCE; DISEASE PREDICTION; INFORMATION; AGREEMENT; KAPPA; RISK
DOI
10.1016/j.inffus.2025.102981
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Electronic Health Records (EHRs) have transformed healthcare by digitally consolidating patient medical history, encompassing structured data (e.g., demographic data, lab results), and unstructured textual data (e.g., clinical notes). These data hold significant potential for predictive modelling, and recent studies have dedicated efforts to leverage the different modalities in a cohesive and effective manner to improve predictive accuracy. This Systematic Literature Review (SLR) addresses the application of Multimodal Deep Learning (MDL) methods in EHR-based prediction modelling, specifically through the fusion of structured and textual data. Following PRISMA guidelines, we conducted a comprehensive literature search across six article databases, using a carefully designed search string. After applying inclusion and exclusion criteria, we selected 77 primary studies. Data extraction was standardized using a structured form based on the CHARMS checklist. We categorized and analysed the fusion strategies employed across the studies. By combining structured and textual data at the input level, early fusion enabled models to learn joint feature representations from the beginning, whether in vectorized representations or data textualization. Intermediate fusion, which delays integration, was particularly useful for tasks where each modality provides unique insights that need to be processed independently before being combined. Late fusion enabled modularity by integrating outputs from unimodal models, which is suitable when EHR structured and textual data have varying quality or reliability. We also identified trends and open issues that need attention. This review contributes a comprehensive understanding of EHR data fusion practices using MDL, highlighting potential pathways for future research and development in health informatics.
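The three fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not taken from the review or from any of the 77 primary studies: the toy `dense` and `predict` helpers, the fixed weights, and the input vectors are all hypothetical, chosen only to show where the structured and textual modalities are combined in each strategy.

```python
import math

def dense(x, weights, bias):
    """Toy encoder: one linear layer followed by tanh (weights is a list of rows)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    """Toy classifier head: a weighted sum squashed to a probability."""
    return sigmoid(sum(w * v for w, v in zip(weights, features)) + bias)

# Hypothetical inputs: three structured values (e.g. scaled lab results)
# and a four-dimensional embedding of a clinical note.
structured = [0.2, -1.0, 0.5]
text_vec = [0.1, 0.3, -0.2, 0.7]

def early_fusion(structured, text_vec):
    # Combine the modalities at the input level, then use a single model.
    joint = structured + text_vec
    return predict(joint, [0.1] * len(joint), 0.0)

def intermediate_fusion(structured, text_vec):
    # Encode each modality independently, then join the hidden representations.
    h_s = dense(structured, [[0.5, -0.2, 0.1], [0.3, 0.3, 0.3]], [0.0, 0.0])
    h_t = dense(text_vec, [[0.2, 0.1, -0.4, 0.6], [0.1, 0.1, 0.1, 0.1]], [0.0, 0.0])
    return predict(h_s + h_t, [0.25, 0.25, 0.25, 0.25], 0.0)

def late_fusion(structured, text_vec, w_structured=0.5):
    # Run one unimodal model per modality and average their outputs.
    p_s = predict(structured, [0.4, -0.3, 0.2], 0.0)
    p_t = predict(text_vec, [0.5, 0.5, 0.5, 0.5], 0.0)
    return w_structured * p_s + (1.0 - w_structured) * p_t

if __name__ == "__main__":
    print("early       ", early_fusion(structured, text_vec))
    print("intermediate", intermediate_fusion(structured, text_vec))
    print("late        ", late_fusion(structured, text_vec))
```

In the late-fusion function, the `w_structured` parameter (an assumption of this sketch) stands in for the idea that one modality may be trusted more than the other when data quality varies.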
Pages: 24