Robust Multimodal Learning With Missing Modalities via Parameter-Efficient Adaptation

Citations: 0
Authors
Reza, Md Kaykobad [1]
Prater-Bennette, Ashley [2]
Asif, M. Salman [1]
Affiliations
[1] Univ Calif Riverside, Riverside, CA 92508 USA
[2] US Air Force, Res Lab, Rome, NY 13441 USA
Keywords
Adaptation models; Training; Computational modeling; Robustness; Modulation; Transforms; Sentiment analysis; Data models; Solid modeling; Knowledge engineering; Missing modality adaptation; missing modality robustness; parameter-efficient adaptation; robust multimodal learning
DOI
10.1109/TPAMI.2024.3476487
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in some correlated modalities. However, we observe that the performance of several existing multimodal networks deteriorates significantly if one or more modalities are absent at test time. To enable robustness to missing modalities, we propose a simple and parameter-efficient adaptation procedure for pretrained multimodal networks. In particular, we exploit modulation of intermediate features to compensate for the missing modalities. We demonstrate that such adaptation can partially bridge the performance drop due to missing modalities and, in some cases, outperform independent, dedicated networks trained for the available modality combinations. The proposed adaptation requires an extremely small number of parameters (e.g., fewer than 1% of the total parameters) and is applicable to a wide range of modality combinations and tasks. We conduct a series of experiments to highlight the missing modality robustness of our proposed method on five different multimodal tasks across seven datasets. Our proposed method demonstrates versatility across various tasks and datasets, and outperforms existing methods for robust multimodal learning with missing modalities.
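The idea of modulating intermediate features of a frozen pretrained network can be illustrated with a minimal sketch. The abstract does not specify the exact form of modulation, so a per-channel scale and shift is assumed here purely for illustration; the layer size, the names `frozen_layer`, `modulated_layer`, `gamma`, and `beta`, and the random weights are all hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen "pretrained" layer: a 256 -> 256 linear map with ReLU.
# In the adaptation setting these weights would stay fixed.
d = 256
W = rng.standard_normal((d, d)) / np.sqrt(d)
b = np.zeros(d)

def frozen_layer(x):
    """Intermediate features of the frozen network (weights never updated)."""
    return np.maximum(x @ W + b, 0.0)

# Assumed modulation parameters: one scale and one shift per feature channel.
# These would be the only parameters trained for a missing-modality scenario.
gamma = np.ones(d)   # identity scale at initialization
beta = np.zeros(d)   # zero shift at initialization

def modulated_layer(x, gamma, beta):
    """Apply per-channel scale and shift to the frozen features."""
    return gamma * frozen_layer(x) + beta

x = rng.standard_normal((4, d))          # a batch of 4 feature vectors
h = frozen_layer(x)
h_mod = modulated_layer(x, gamma, beta)  # equals h at initialization

# Fraction of parameters that the modulation adds for this one layer.
frozen_params = W.size + b.size
adapter_params = gamma.size + beta.size
fraction = adapter_params / (frozen_params + adapter_params)
print(f"adapter fraction: {fraction:.4%}")
```

With identity initialization the modulated features match the frozen ones, so adaptation starts from the pretrained behavior. For this toy layer the added parameters are well under 1% of the total, which is in the same spirit as the budget quoted in the abstract, though the actual figure depends on the network being adapted.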
Pages: 742-754
Page count: 13