Robust Multimodal Representation under Uncertain Missing Modalities

Cited by: 0
Authors
Lan, Guilin [1]
Du, Yeqian [1]
Yang, Zhouwang [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Multimodal representation; Missing modalities; Multimodal sentiment analysis; Multimedia
DOI
10.1145/3702003
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Multimodal representation learning has gained significant attention across various fields, yet it faces challenges when dealing with missing modalities in real-world applications. Existing solutions are confined to specific scenarios, such as a single missing modality or modalities missing at test time, which restricts their applicability. To address the more general scenario of uncertain missing modalities in both training and testing, we propose the Robust Multimodal Representation under Uncertain Missing Modalities (RMRU) framework. The framework projects each modality's representation into a shared subspace, enabling the reconstruction of any missing modality within a unified model. We also propose an interaction refinement module that uses cross-modal attention to enhance these reconstructions, which is particularly beneficial when complete-modality data are limited. Furthermore, we introduce an iterative training strategy that alternately trains different modules to make effective use of both complete and incomplete modality data. Experimental results on four benchmark datasets demonstrate the superiority of RMRU over existing baselines, particularly when the rate of missing modalities is high. Notably, RMRU can be applied to diverse scenarios, regardless of modality types and quantities.
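The shared-subspace idea in the abstract can be illustrated with a minimal sketch. It is not the authors' RMRU architecture: the class name `SharedSubspaceSketch`, the linear encoders and decoders, the averaging of available projections, and the untrained random weights are all assumptions chosen only to show the data flow. The cross-modal attention refinement and the iterative training strategy are omitted.

```python
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SharedSubspaceSketch:
    """Hypothetical model: one linear encoder and decoder per modality."""

    def __init__(self, modality_dims, shared_dim, seed=0):
        rng = random.Random(seed)
        self.shared_dim = shared_dim
        # Encoders map each modality into the shared subspace.
        self.encoders = {
            name: make_matrix(shared_dim, dim, rng)
            for name, dim in modality_dims.items()
        }
        # Decoders map the shared representation back to each modality.
        self.decoders = {
            name: make_matrix(dim, shared_dim, rng)
            for name, dim in modality_dims.items()
        }

    def encode(self, inputs):
        """Average the shared-space projections of the modalities present."""
        available = {n: x for n, x in inputs.items() if x is not None}
        if not available:
            raise ValueError("at least one modality must be present")
        projected = [matvec(self.encoders[n], x) for n, x in available.items()]
        return [sum(values) / len(projected) for values in zip(*projected)]

    def reconstruct_missing(self, inputs):
        """Fill every missing modality (None) from the shared representation."""
        shared = self.encode(inputs)
        return {
            name: x if x is not None else matvec(self.decoders[name], shared)
            for name, x in inputs.items()
        }


# Usage: the audio modality is missing and is reconstructed from the other two.
model = SharedSubspaceSketch({"text": 4, "audio": 3, "vision": 2}, shared_dim=5)
sample = {"text": [0.1, 0.2, 0.3, 0.4], "audio": None, "vision": [1.0, -1.0]}
completed = model.reconstruct_missing(sample)
print(len(completed["audio"]))  # 3
```

Because every modality shares one subspace, the same model handles any pattern of missing inputs; a trained version would learn the encoder and decoder weights rather than sampling them at random.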
Pages: 23