Transformer-based correlation mining network with self-supervised label generation for multimodal sentiment analysis

Citations: 1
Authors
Wang, Ruiqing [1 ]
Yang, Qimeng [1 ]
Tian, Shengwei [1 ]
Yu, Long [2 ]
He, Xiaoyu [3 ]
Wang, Bo [1 ]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Network & Informat Ctr, Xinjiang, Peoples R China
[3] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multimodal sentiment analysis; Transformer; Multimodal fusion; Collaborative learning; FUSION;
DOI
10.1016/j.neucom.2024.129163
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multimodal Sentiment Analysis (MSA), which aims to recognize and understand a speaker's sentiment state by integrating information from natural language, facial expressions, and voice, has gained much attention in recent years. However, modeling multimodal data poses two main challenges: 1) potential sentiment correlations exist both across modalities and within each modality's context, making deep sentiment-correlation mining and information fusion difficult; 2) sentiment information tends to be unevenly distributed across modalities, making it hard to fully leverage every modality for collaborative learning. To address these challenges, we propose CMLG, a method based on correlation mining and label generation. CMLG uses a Squeeze-and-Excitation Network (SEN) to recalibrate modality features and employs Transformer-based intra-modal and inter-modal feature extractors to mine the intrinsic connections between different modalities. In addition, we design a Self-Supervised Label Generation Module (SLGM) that exploits the positive correlation between feature distances and label offsets to generate unimodal labels, and jointly trains the multimodal and unimodal tasks to capture sentiment differences. Extensive experiments on three benchmark datasets (MOSI, MOSEI, and SIMS) show that CMLG achieves excellent results.
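The abstract's core idea for the SLGM can be illustrated with a minimal sketch: a unimodal pseudo-label is produced by offsetting the shared multimodal label in proportion to how much closer a modality's feature lies to the positive or negative class center. Note this is an assumed reading of the described distance-to-offset correlation, not the paper's exact formulation; the function name, `scale` parameter, and center computation are hypothetical.

```python
import numpy as np

def generate_unimodal_label(feat, pos_center, neg_center, multimodal_label, scale=0.5):
    """Sketch of self-supervised unimodal label generation (assumed form).

    feat: unimodal feature vector for one sample.
    pos_center / neg_center: running class centers of positive / negative samples.
    multimodal_label: the ground-truth multimodal sentiment score for the sample.
    """
    # Distances from the unimodal feature to the two class centers.
    d_pos = np.linalg.norm(feat - pos_center)
    d_neg = np.linalg.norm(feat - neg_center)
    # Relative distance in [-1, 1]; negative means closer to the positive center.
    rel = (d_pos - d_neg) / (d_pos + d_neg + 1e-8)
    # Offset the shared label in proportion to the relative distance, so a
    # modality whose feature leans positive receives a more positive label.
    return multimodal_label - scale * rel
```

A feature sitting exactly at the positive center receives the maximum positive offset, while one equidistant from both centers keeps the multimodal label unchanged; joint training on the original and generated labels then exposes per-modality sentiment differences.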
Pages: 9
Related Papers
50 records in total
  • [31] ATTENTION-GUIDED CONTRASTIVE MASKED IMAGE MODELING FOR TRANSFORMER-BASED SELF-SUPERVISED LEARNING
    Zhan, Yucheng
    Zhao, Yucheng
    Luo, Chong
    Zhang, Yueyi
    Sun, Xiaoyan
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2490 - 2494
  • [32] A superior image inpainting scheme using Transformer-based self-supervised attention GAN model
    Zhou, Meili
    Liu, Xiangzhen
    Yi, Tingting
    Bai, Zongwen
    Zhang, Pei
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 233
  • [33] Self-supervised Transformer-Based Pre-training Method with General Plant Infection Dataset
    Wang, Zhengle
    Wang, Ruifeng
    Wang, Minjuan
    Lai, Tianyun
    Zhang, Man
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT II, 2025, 15032 : 189 - 202
  • [34] A Novel Transformer-Based Self-Supervised Learning Method to Enhance Photoplethysmogram Signal Artifact Detection
    Le, Thanh-Dung
    Macabiau, Clara
    Albert, Kevin
    Jouvet, Philippe
    Noumeir, Rita
    IEEE ACCESS, 2024, 12 : 159860 - 159874
  • [35] Self-supervised representation learning using multimodal Transformer for emotion recognition
    Goetz, Theresa
    Arora, Pulkit
    Erick, F. X.
    Holzer, Nina
    Sawant, Shrutika
    PROCEEDINGS OF THE 8TH INTERNATIONAL WORKSHOP ON SENSOR-BASED ACTIVITY RECOGNITION AND ARTIFICIAL INTELLIGENCE, IWOAR 2023, 2023,
  • [36] Interpretability in Sentiment Analysis: A Self-Supervised Approach to Sentiment Cue Extraction
    Sun, Yawei
    He, Saike
    Han, Xu
    Luo, Yan
    APPLIED SCIENCES-BASEL, 2024, 14 (07):
  • [37] Multimodal sentiment analysis based on improved correlation representation network
    Yaermaimaiti, Yilihamu
    Yan, Tianxing
    Zhuang, Guohang
    Kari, Tusongjiang
    INTERNATIONAL JOURNAL OF COMMUNICATION NETWORKS AND DISTRIBUTED SYSTEMS, 2024, 30 (06) : 679 - 698
  • [38] Self-HCL: Self-Supervised Multitask Learning with Hybrid Contrastive Learning Strategy for Multimodal Sentiment Analysis
    Fu, Youjia
    Fu, Junsong
    Xue, Huixia
    Xu, Zihao
    ELECTRONICS, 2024, 13 (14)
  • [39] A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
    Kotei, Evans
    Thirunavukarasu, Ramkumar
    INFORMATION, 2023, 14 (03)
  • [40] Liveness Detection in Computer Vision: Transformer-Based Self-Supervised Learning for Face Anti-Spoofing
    Keresh, Arman
    Shamoi, Pakizar
    IEEE ACCESS, 2024, 12 : 185673 - 185685