Multivariate, Multi-frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation

Cited by: 14

Authors:

Chen, Feiyu [1,2]

Shao, Jie [1,2]

Zhu, Shuyuan [1]

Shen, Heng Tao [1,2]

Affiliations:

[1] Univ Elect Sci & Technol China, Chengdu, Peoples R China

[2] Sichuan Artificial Intelligence Res Inst, Yibin, Peoples R China

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1109/CVPR52729.2023.01036

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Complex relationships of high arity across modality and context dimensions are a critical challenge in the Emotion Recognition in Conversation (ERC) task. Yet previous works tend to encode multimodal and contextual relationships in a loosely coupled manner, which may harm relationship modelling. Recently, Graph Neural Networks (GNNs), which show advantages in capturing data relations, offer a new solution for ERC. However, existing GNN-based ERC models fail to address some general limits of GNNs, including assuming a pairwise formulation and erasing high-frequency signals, which may be trivial for many applications but are crucial for the ERC task. In this paper, we propose a GNN-based model that explores multivariate relationships and captures the varying importance of emotion discrepancy and commonality by valuing multi-frequency signals. We empower GNNs to better capture the inherent relationships among utterances and deliver more sufficient multimodal and contextual modelling. Experimental results show that our proposed method outperforms previous state-of-the-art works on two popular multimodal ERC datasets.

Citation

Pages: 10761-10770

Number of pages: 10

References (34 in total)

[1] Bai, Song; Zhang, Feihu; Torr, Philip H. S. Hypergraph convolution and hypergraph attention [J]. PATTERN RECOGNITION, 2021, 110
[2] Bo DY, 2021, AAAI CONF ARTIF INTE, V35, P3950
[3] Busso, Carlos; Bulut, Murtaza; Lee, Chi-Chun; Kazemzadeh, Abe; Mower, Emily; Kim, Samuel; Chang, Jeannette N.; Lee, Sungbok; Narayanan, Shrikanth S. IEMOCAP: interactive emotional dyadic motion capture database [J]. LANGUAGE RESOURCES AND EVALUATION, 2008, 42 (04): 335-359
[4] Chi, Hyung-gun; Ha, Myoung Hoon; Chi, Seunggeun; Lee, Sang Wan; Huang, Qixing; Ramani, Karthik. InfoGCN: Representation Learning for Human Skeleton-based Action Recognition [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 20154-20164
[5] Chitra U, 2019, PR MACH LEARN RES, V97
[6] Dong, Yushun; Ding, Kaize; Jalaian, Brian; Ji, Shuiwang; Li, Jundong. AdaGNN: Graph Neural Networks with Adaptive Frequency Response Filter [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021: 392-401
[7] Eyben F., 2010, 18 ACM INT C MUL MM, P1459, DOI 10.1145/1873951.1874246
[8] Feng YF, 2019, AAAI CONF ARTIF INTE, P3558
[9] Gattuso, Domenico; Cassone, Gian Carla; Mai, Serge. Energy savings through innovative and automated freight trains [J]. EUROPEAN TRANSPORT-TRASPORTI EUROPEI, 2022, (87):
[10] Ghosal D, 2020, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, P2470
