Attention-optimized vision-enhanced prompt learning for few-shot multi-modal sentiment analysis

Cited by: 0
Authors
Zhou, Zikai [1 ]
Qiao, Baiyou [1 ]
Feng, Haisong [2 ]
Han, Donghong [1 ]
Wu, Gang [1 ]
Affiliations
[1] School of Computer Science and Engineering, Northeastern University, Shenyang
[2] School of Informatics, Xiamen University, Xiamen
Funding
National Natural Science Foundation of China
Keywords
Few-shot learning; GCN; Multi-modal sentiment analysis; Prompt learning;
DOI
10.1007/s00521-024-10297-w
Abstract
With the explosion of multi-modal data, multi-modal sentiment analysis (MSA) has emerged and attracted widespread attention. Unfortunately, conventional multi-modal research relies on large-scale datasets: collecting and annotating such datasets is challenging and resource-intensive, and training on them further increases research cost. In contrast, the recently proposed few-shot MSA (FMSA) requires only a few samples for training and is therefore more practical and realistic. Prompt-based methods have been investigated for FMSA, but they have not sufficiently considered or leveraged the information specificity of the visual modality. We therefore propose a vision-enhanced prompt-based model built on a graph structure, which better utilizes visual information for fusion and collaboration when encoding and optimizing prompt representations. Specifically, we first design an aggregation-based multi-modal attention module. Then, based on this module and biaffine attention, we construct a syntax–semantic dual-channel graph convolutional network that optimizes the encoding of learnable prompts by exploiting vision-enhanced semantic and syntactic knowledge. Finally, we propose a collaboration-based optimization module built on a collaborative attention mechanism, which employs visual information to collaboratively refine prompt representations. Extensive experiments on both coarse-grained and fine-grained MSA datasets demonstrate that our model significantly outperforms the baseline models. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
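The abstract builds its dual-channel graph on biaffine attention, which scores every pair of token representations with a bilinear term plus a linear term and can then be normalized into a soft adjacency matrix for the GCN. The paper's exact formulation is not given here, so the following is only a minimal NumPy sketch of standard biaffine scoring under assumed shapes (the names `H_dep`, `H_head`, `U`, `w` are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 4, 4, 8                      # 4 x 4 node pairs, hidden size 8 (assumed toy sizes)
H_dep = rng.standard_normal((n, d))    # e.g. prompt/token representations
H_head = rng.standard_normal((m, d))   # e.g. vision-enhanced representations
U = rng.standard_normal((d, d))        # bilinear weight matrix
w = rng.standard_normal(2 * d)         # linear weight on the concatenated pair
b = 0.1                                # scalar bias

def biaffine_scores(H_dep, H_head, U, w, b):
    """score[i, j] = H_dep[i] @ U @ H_head[j] + w @ [H_dep[i]; H_head[j]] + b"""
    bilinear = H_dep @ U @ H_head.T                               # (n, m)
    linear = (H_dep @ w[:d])[:, None] + (H_head @ w[d:])[None, :]  # broadcast sum
    return bilinear + linear + b

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Row-normalized scores give a soft adjacency matrix a GCN layer could consume.
A = softmax(biaffine_scores(H_dep, H_head, U, w, b))
print(A.shape)        # (4, 4)
print(A.sum(axis=1))  # each row sums to 1
```

In practice `U`, `w`, and `b` would be learned parameters (e.g. a `torch.nn.Bilinear` plus a linear layer), and the resulting matrix would weight message passing in the syntax and semantic channels.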
Pages: 21091–21105
Page count: 14