Disentangled Variational Autoencoder for Emotion Recognition in Conversations

Cited by: 5
Authors
Yang, Kailai [1]
Zhang, Tianlin [1]
Ananiadou, Sophia [1]
Affiliation
[1] Univ Manchester, Dept Comp Sci, NaCTeM, Manchester M13 9PL, England
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK;
Keywords
Task analysis; Emotion recognition; Hidden Markov models; Context modeling; Decoding; Oral communication; Gaussian distribution; Emotion recognition in conversations; variational autoencoder; valence-arousal-dominance; disentangled representations; DIALOGUE;
DOI
10.1109/TAFFC.2023.3280038
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In Emotion Recognition in Conversations (ERC), the emotions of target utterances depend closely on their context. Existing works therefore train the model to generate the response to the target utterance, aiming to recognise emotions by leveraging contextual information. However, adjacent response generation ignores long-range dependencies and in many cases provides limited affective information. In addition, most ERC models learn a unified distributed representation for each utterance, which lacks interpretability and robustness. To address these issues, we propose a VAD-disentangled Variational AutoEncoder (VAD-VAE), which first introduces a target utterance reconstruction task based on a Variational Autoencoder and then disentangles three affect representations, Valence-Arousal-Dominance (VAD), from the latent space. We further enhance the disentangled representations by introducing VAD supervision signals from a sentiment lexicon and by minimising the mutual information between the VAD distributions. Experiments show that VAD-VAE outperforms state-of-the-art models on two datasets. Further analysis demonstrates the effectiveness of each proposed module and the quality of the disentangled VAD representations.
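The abstract describes the VAD-VAE idea only at a high level. The following is a minimal, hypothetical PyTorch sketch of that idea, not the paper's actual implementation: the class name VADVAESketch, the 768-dimensional utterance encoding (e.g. a RoBERTa embedding), the layer sizes, and the simple MSE-based lexicon supervision are all illustrative assumptions, and the mutual-information penalty between the VAD latents is only indicated in a comment.

```python
# Hypothetical sketch of a VAD-disentangled VAE: per-factor Gaussian latents,
# a reconstruction decoder, and lexicon-based VAD supervision heads.
import torch
import torch.nn as nn

FACTORS = ("valence", "arousal", "dominance")

class VADVAESketch(nn.Module):
    def __init__(self, utt_dim=768, latent_dim=64):
        super().__init__()
        # One Gaussian posterior (mean / log-variance) per affect factor.
        self.mu_heads = nn.ModuleDict({k: nn.Linear(utt_dim, latent_dim) for k in FACTORS})
        self.logvar_heads = nn.ModuleDict({k: nn.Linear(utt_dim, latent_dim) for k in FACTORS})
        # Decoder reconstructs the target utterance representation from the
        # concatenated VAD latents (stand-in for the paper's generative decoder).
        self.decoder = nn.Sequential(
            nn.Linear(3 * latent_dim, utt_dim), nn.Tanh(), nn.Linear(utt_dim, utt_dim)
        )
        # Regression heads predict lexicon VAD scores from each latent (supervision signal).
        self.vad_regressors = nn.ModuleDict({k: nn.Linear(latent_dim, 1) for k in FACTORS})

    @staticmethod
    def reparameterize(mu, logvar):
        # Reparameterisation trick: sample z = mu + sigma * eps.
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, utt_repr):
        """utt_repr: (batch, utt_dim) contextual utterance encoding."""
        latents, kl = {}, 0.0
        for k in FACTORS:
            mu, logvar = self.mu_heads[k](utt_repr), self.logvar_heads[k](utt_repr)
            latents[k] = self.reparameterize(mu, logvar)
            # KL divergence of each factor's posterior to a standard Gaussian prior.
            kl = kl + (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)).mean()
        recon = self.decoder(torch.cat([latents[k] for k in FACTORS], dim=-1))
        vad_pred = {k: self.vad_regressors[k](latents[k]).squeeze(-1) for k in FACTORS}
        return recon, vad_pred, kl

# Usage: reconstruction + KL + VAD-supervision losses. A mutual-information
# penalty between the three latents (e.g. a CLUB-style estimator) would be added here.
model = VADVAESketch()
utt = torch.randn(4, 768)                              # assumed encoder outputs
lexicon_vad = {k: torch.rand(4) for k in FACTORS}      # assumed lexicon VAD scores in [0, 1]
recon, vad_pred, kl = model(utt)
loss = nn.functional.mse_loss(recon, utt) + kl
loss = loss + sum(nn.functional.mse_loss(vad_pred[k], lexicon_vad[k]) for k in FACTORS)
loss.backward()
```

The separate posterior per factor is what makes the representation disentangled by construction; the lexicon supervision and the (omitted) mutual-information term are the mechanisms the abstract names for keeping the three latents affect-specific and independent.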
Pages: 508 - 518
Page count: 11
Related papers
50 records in total
  • [31] Deep Imbalanced Learning for Multimodal Emotion Recognition in Conversations
    Meng, Tao
    Shou, Yuntao
    Ai, Wei
    Yin, Nan
    Li, Keqin
IEEE Transactions on Artificial Intelligence, 2024, 5 (12): 1 - 15
  • [32] Learning Disentangled Representations with the Wasserstein Autoencoder
    Gaujac, Benoit
    Feige, Ilya
    Barber, David
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III, 2021, 12977 : 69 - 84
  • [33] MM-DFN: MULTIMODAL DYNAMIC FUSION NETWORK FOR EMOTION RECOGNITION IN CONVERSATIONS
    Hu, Dou
    Hou, Xiaolong
    Wei, Lingwei
    Jiang, Lianxin
    Mo, Yang
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7037 - 7041
  • [34] DISENTANGLED SPEECH REPRESENTATION LEARNING BASED ON FACTORIZED HIERARCHICAL VARIATIONAL AUTOENCODER WITH SELF-SUPERVISED OBJECTIVE
    Xie, Yuying
    Arildsen, Thomas
    Tan, Zheng-Hua
    2021 IEEE 31ST INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2021,
  • [35] Intelligent optimal feature selection-based hybrid variational autoencoder and block recurrent transformer network for accurate emotion recognition model using EEG signals
    Reddy, C. H. Narsimha
    Mahesh, Shanthi
    Manjunathachari, K.
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (02) : 1027 - 1039
  • [36] Identity-Aware Variational Autoencoder for Face Swapping
    Li, Zonglin
    Zhang, Zhaoxin
    He, Shengfeng
    Meng, Quanling
    Zhang, Shengping
    Zhong, Bineng
    Ji, Rongrong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 5466 - 5479
  • [38] MODELING GENDER INFORMATION FOR EMOTION RECOGNITION USING DENOISING AUTOENCODER
    Xia, Rui
    Deng, Jun
    Schuller, Bjoern
    Liu, Yang
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [39] Emotion recognition in conversations with emotion shift detection based on multi-task learning
    Gao, Qingqing
    Cao, Biwei
    Guan, Xin
    Gu, Tianyun
    Bao, Xing
    Wu, Junyan
    Liu, Bo
    Cao, Jiuxin
    KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [40] Speech Emotion Recognition Considering Nonverbal Vocalization in Affective Conversations
    Hsu, Jia-Hao
    Su, Ming-Hsiang
    Wu, Chung-Hsien
    Chen, Yi-Hsuan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 (29) : 1675 - 1686