Multivariate, Multi-frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation

被引:14
作者
Chen, Feiyu [1 ,2 ]
Shao, Jie [1 ,2 ]
Zhu, Shuyuan [1 ]
Shen, Heng Tao [1 ,2 ]
机构
[1] Univ Elect Sci & Technol China, Chengdu, Peoples R China
[2] Sichuan Artificial Intelligence Res Inst, Yibin, Peoples R China
来源
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023年
基金
中国国家自然科学基金;
关键词
D O I
10.1109/CVPR52729.2023.01036
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Complex relationships of high arity across modality and context dimensions is a critical challenge in the Emotion Recognition in Conversation (ERC) task. Yet, previous works tend to encode multimodal and contextual relationships in a loosely-coupled manner, which may harm relationship modelling. Recently, Graph Neural Networks (GNN) which show advantages in capturing data relations, offer a new solution for ERC. However, existing GNN-based ERC models fail to address some general limits of GNNs, including assuming pairwise formulation and erasing high-frequency signals, which may be trivial for many applications but crucial for the ERC task. In this paper, we propose a GNN-based model that explores multivariate relationships and captures the varying importance of emotion discrepancy and commonality by valuing multi-frequency signals. We empower GNNs to better capture the inherent relationships among utterances and deliver more sufficient multimodal and contextual modelling. Experimental results show that our proposed method outperforms previous state-of-the-art works on two popular multimodal ERC datasets.
引用
收藏
页码:10761 / 10770
页数:10
相关论文
共 34 条
  • [11] Ghosal D, 2019, 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019), P154
  • [12] MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis
    Hazarika, Devamanyu
    Zimmermann, Roger
    Poria, Soujanya
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1122 - 1131
  • [13] Hazarika D, 2018, 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), P2594
  • [14] Hazarika Devamanyu, 2018, Proc Conf, V2018, P2122, DOI 10.18653/v1/n18-1193
  • [15] LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation
    He, Xiangnan
    Deng, Kuan
    Wang, Xiang
    Li, Yan
    Zhang, Yongdong
    Wang, Meng
    [J]. PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 639 - 648
  • [16] Hoang NT, 2019, ABS190509550 CORR
  • [17] MM-DFN: MULTIMODAL DYNAMIC FUSION NETWORK FOR EMOTION RECOGNITION IN CONVERSATIONS
    Hu, Dou
    Hou, Xiaolong
    Wei, Lingwei
    Jiang, Lianxin
    Mo, Yang
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7037 - 7041
  • [18] Hu Jingwen, 2021, P ANN M ASS COMP LIN, V1, P5666
  • [19] Hypoxia-Inducible Factor-Proline Hydroxylase Inhibitor in the Treatment of Renal Anemia
    Hu, Xiaofan
    Xie, Jingyuan
    Chen, Nan
    [J]. KIDNEY DISEASES, 2021, 7 (01) : 1 - 9
  • [20] Densely Connected Convolutional Networks
    Huang, Gao
    Liu, Zhuang
    van der Maaten, Laurens
    Weinberger, Kilian Q.
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 2261 - 2269