A Cross-Modal Correlation Fusion Network for Emotion Recognition in Conversations

Cited by: 0
Authors
Tang, Xiaolyu [1]
Cai, Guoyong [1]
Chen, Ming [1]
Yuan, Peicong [1]
Affiliations
[1] Guilin Univ Elect Technol, Guilin, Peoples R China
Source
NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT V, NLPCC 2024 | 2025 / Vol. 15363
Funding
National Natural Science Foundation of China
Keywords
Emotion Recognition in Conversations; Multimodal fusion; Cross-modal correlations
DOI
10.1007/978-981-97-9443-0_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion Recognition in Conversations (ERC) aims to predict the emotions conveyed by the utterances of a conversation. In this paper, we propose a Cross-Modal Correlation Fusion Network (CMCFN), which addresses two limitations of existing approaches: their limited ability to exploit correlations across multiple modalities, and the difficulty of classifying tail emotion categories. The proposed Cross-Modal Correlation Encoder (CMCE) models intricate cross-modal correlations in a conversation, facilitating efficient multimodal fusion. In addition, the Multimodal Contrastive Representation Learning Network (MCRLN) mitigates the difficulty of categorizing tail emotions by combining supervised contrastive learning with multimodal data augmentation. Experimental results on the IEMOCAP and MELD datasets demonstrate the effectiveness and superiority of the proposed CMCFN model.
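The abstract names supervised contrastive learning as one ingredient of MCRLN but gives no formula. As a rough illustration only, the sketch below implements the generic supervised contrastive loss over one batch of embeddings: for each anchor, same-label samples are pulled together and all other samples are pushed apart. It is not the paper's MCRLN and omits multimodal data augmentation entirely. The function name `supervised_contrastive_loss`, the `temperature` default, and the toy inputs are assumptions made for this example.

```python
import math


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss for one batch (illustrative sketch).

    features: list of non-zero embedding vectors (lists of floats).
    labels:   list of class labels, one per embedding.
    """
    n = len(features)
    # L2-normalise each embedding so the dot product is a cosine similarity.
    unit = []
    for vec in features:
        norm = math.sqrt(sum(x * x for x in vec))
        unit.append([x / norm for x in vec])

    anchor_losses = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        logits = {
            j: sum(a * b for a, b in zip(unit[i], unit[j])) / temperature
            for j in others
        }
        # Log-sum-exp over all other samples, shifted by the max for stability.
        top = max(logits.values())
        log_denom = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        # Average negative log-probability assigned to the positives.
        anchor_losses.append(
            -sum(logits[j] - log_denom for j in positives) / len(positives)
        )
    # Mean over anchors that have at least one positive; 0.0 if none do.
    return sum(anchor_losses) / len(anchor_losses) if anchor_losses else 0.0


if __name__ == "__main__":
    # Toy embeddings: the first two point one way, the last two another.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(toy, ["happy", "happy", "sad", "sad"]))
```

One property of this formulation is that every anchor is averaged over its own positives, so a rare class with only a few samples in the batch still contributes a full per-anchor term. That is one plausible reason such a loss could help with tail categories, though the abstract does not say how the authors use it.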
Pages: 55-68
Number of pages: 14
Related papers
50 in total
  • [1] Hierarchical Cross-Modal Interaction and Fusion Network Enhanced with Self-Distillation for Emotion Recognition in Conversations
    Wei, Puling
    Yang, Juan
    Xiao, Yali
    ELECTRONICS, 2024, 13 (13)
  • [2] A cross-modal fusion network based on graph feature learning for multimodal emotion recognition
    Cao Xiaopeng
    Zhang Linying
    Chen Qiuxian
    Ning Hailong
    Dong Yizhuo
    The Journal of China Universities of Posts and Telecommunications, 2024, 31 (06) : 16 - 25
  • [3] Speech Emotion Recognition Using Global-Aware Cross-Modal Feature Fusion Network
    Li, Feng
    Luo, Jiusong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT II, 2023, 14087 : 211 - 221
  • [4] CFN-ESA: A Cross-Modal Fusion Network With Emotion-Shift Awareness for Dialogue Emotion Recognition
    Li J.
    Wang X.
    Liu Y.
    Zeng Z.
    IEEE Transactions on Affective Computing, 2024, 15 (04) : 1 - 16
  • [5] Speaker-aware Cross-modal Fusion Architecture for Conversational Emotion Recognition
    Zhao, Huan
    Li, Bo
    Zhang, Zixing
    INTERSPEECH 2023, 2023 : 2718 - 2722
  • [6] CMFN: Cross-Modal Fusion Network for Irregular Scene Text Recognition
    Zheng, Jinzhi
    Ji, Ruyi
    Zhang, Libo
    Wu, Yanjun
    Zhao, Chen
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT VI, 2024, 14452 : 421 - 433
  • [7] CCL: Cross-modal Correlation Learning With Multigrained Fusion by Hierarchical Network
    Peng, Yuxin
    Qi, Jinwei
    Huang, Xin
    Yuan, Yuxin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (02) : 405 - 420
  • [8] A multimodal shared network with a cross-modal distribution constraint for continuous emotion recognition
    Li, Chiqin
    Xie, Lun
    Shao, Xingmao
    Pan, Hang
    Wang, Zhiliang
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [9] Kernel Cross-Modal Factor Analysis for Information Fusion With Application to Bimodal Emotion Recognition
    Wang, Yongjin
    Guan, Ling
    Venetsanopoulos, Anastasios N.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2012, 14 (03) : 597 - 607
  • [10] Research on cross-modal emotion recognition based on multi-layer semantic fusion
    Xu Z.
    Gao Y.
    Mathematical Biosciences and Engineering, 2024, 21 (02) : 2488 - 2514