Robust Multimodal Representation under Uncertain Missing Modalities

Cited by: 0
Authors
Lan, Guilin [1 ]
Du, Yeqian [1 ]
Yang, Zhouwang [1 ]
Institutions
[1] Univ Sci & Technol China, Hefei, Peoples R China
Funding
National Key R&D Program of China;
Keywords
Multimodal representation; Missing modalities; Multimodal sentiment analysis; Multimedia;
DOI
10.1145/3702003
CLC number
TP [Automation Technology, Computer Technology];
Discipline code
0812;
Abstract
Multimodal representation learning has gained significant attention across various fields, yet it faces challenges when dealing with missing modalities in real-world applications. Existing solutions are confined to specific scenarios, such as a single missing modality or modalities missing only at test time, which restricts their applicability. To address the more general scenario of uncertain missing modalities in both training and testing, we propose RMRU, a framework that projects each modality's representation into a shared subspace, enabling the reconstruction of any missing modality within a unified model. We further propose an interaction refinement module that uses cross-modal attention to enhance these reconstructions, which is particularly beneficial when complete-modality data are limited. In addition, we introduce an iterative training strategy that alternately trains the different modules to effectively exploit both complete and incomplete modality data. Experimental results on four benchmark datasets demonstrate the superiority of RMRU over existing baselines, particularly under high missing-modality rates. Remarkably, RMRU can be broadly applied to diverse scenarios, regardless of modality types and quantities.
Pages: 23
Related papers
50 records in total
  • [41] Adaptive Fusion and Edge-Oriented Enhancement for Brain Tumor Segmentation With Missing Modalities
    Yan, Yulan
    Zhan, Yinwei
    He, Huiyao
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2025, 35 (01)
  • [42] Generative learning-based lightweight MRI brain tumor segmentation with missing modalities
    Zhang, Xinliang
    Chen, Qian
    He, Hangzhou
    Zhu, Lei
    Xie, Zhaoheng
    Lu, Yanye
    Cheng, Fangxiao
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 261
  • [43] ACN: Adversarial Co-training Network for Brain Tumor Segmentation with Missing Modalities
    Wang, Yixin
    Zhang, Yang
    Liu, Yang
    Lin, Zihao
    Tian, Jiang
    Zhong, Cheng
    Shi, Zhongchao
    Fan, Jianping
    He, Zhiqiang
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT VII, 2021, 12907 : 410 - 420
  • [44] Efficient Multimodal Transformer With Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis
    Sun, Licai
    Lian, Zheng
    Liu, Bin
    Tao, Jianhua
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (01) : 309 - 325
  • [45] SCANET: Improving multimodal representation and fusion with sparse- and cross-attention for multimodal sentiment analysis
    Wang, Hao
    Yang, Mingchuan
    Li, Zheng
    Liu, Zhenhua
    Hu, Jie
    Fu, Ziwang
    Liu, Feng
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2022, 33 (3-4)
  • [46] The multimodal representation of emotion in film: Integrating cognitive and semiotic approaches
    Feng, Dezheng
    O'Halloran, Kay L.
    SEMIOTICA, 2013, 197 : 79 - 100
  • [47] DRLN: Disentangled Representation Learning Network for Multimodal Sentiment Analysis
    Hou, Jingming
    Omar, Nazlia
    Tiun, Sabrina
    Saad, Saidah
    He, Qian
    NEURAL COMPUTING FOR ADVANCED APPLICATIONS, NCAA 2024, PT III, 2025, 2183 : 148 - 161
  • [48] Consumption as a Rhythm: A Multimodal Experiment on the Representation of Time-Series
    Macas, Catarina
    Martins, Pedro
    Machado, Penousal
    2018 22ND INTERNATIONAL CONFERENCE INFORMATION VISUALISATION (IV), 2018, : 504 - 509
  • [49] Learning speaker-independent multimodal representation for sentiment analysis
    Wang, Jianwen
    Wang, Shiping
    Lin, Mingwei
    Xu, Zeshui
    Guo, Wenzhong
    INFORMATION SCIENCES, 2023, 628 : 208 - 225
  • [50] Improving multimodal action representation with joint motion history context
    Malawski, Filip
    Kwolek, Bogdan
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2019, 61 : 198 - 208