Emotion Label Encoding using Word Embeddings for Speech Emotion Recognition

Cited by: 1

Authors:

Stanley, Eimear [1]

DeMattos, Eric [1]

Klementiev, Anita [1]

Ozimek, Piotr [1]

Clarke, Georgia [1]

Berger, Michael [1]

Palaz, Dimitri [1]

Affiliations:

[1] Speech Graph Ltd, Edinburgh, Midlothian, Scotland

Source:

INTERSPEECH 2023 | 2023

Keywords:

speech emotion recognition; word embeddings; label encoding; paralinguistics

DOI:

10.21437/Interspeech.2023-1591

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Speech Emotion Recognition (SER) is an important and challenging task for human-computer interaction. Human emotions are complex and nuanced, hence difficult to represent. The standard representations of emotions, categorical or continuous, tend to oversimplify the problem. Recently, the label encoding approach has been proposed, where vectors are used to represent the emotion space. In this paper, we hypothesise that using a pre-existing vector space that encodes semantic information about emotion is beneficial for the task. To this aim, we propose using word embeddings obtained from a Language Model (LM) as labels for SER. We evaluate the performance of the proposed approach on the IEMOCAP corpus and show that it yields better performance than a standard baseline. We also present a method to combine free text labels, which are unusable in conventional approaches, and by doing so we show that the model can learn more nuanced representations of emotions.
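The label-encoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the loss function, the language model, or how free-text labels are combined, so the cosine loss, the averaging of word vectors, and the toy 3-dimensional embeddings below are all assumptions made for illustration. The names `TOY_EMBEDDINGS`, `encode_labels`, `cosine_loss`, and `decode` are hypothetical.

```python
import math

# Toy 3-dimensional "word embeddings". The values are made up for
# illustration; a real system would take them from a pre-trained language model.
TOY_EMBEDDINGS = {
    "angry":      [0.9, 0.1, 0.0],
    "happy":      [0.0, 0.9, 0.2],
    "sad":        [0.1, 0.0, 0.9],
    "neutral":    [0.4, 0.4, 0.4],
    "frustrated": [0.8, 0.0, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def encode_labels(words):
    """Turn one or more label words into a single target vector.

    Assumption: several labels (for example a categorical label plus a
    free-text annotation) are combined by averaging their embeddings.
    """
    vectors = [TOY_EMBEDDINGS[w] for w in words]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_loss(predicted, target):
    """Assumed training objective: 1 - cosine similarity (0 when aligned)."""
    return 1.0 - cosine_similarity(predicted, target)


def decode(predicted, classes):
    """Map a predicted vector to the class whose embedding is most similar."""
    return max(classes, key=lambda c: cosine_similarity(predicted, TOY_EMBEDDINGS[c]))


# Usage: a categorical label enriched with a free-text label.
target = encode_labels(["angry", "frustrated"])
label = decode(target, ["angry", "happy", "sad", "neutral"])
```

In this sketch a speech model would regress onto `target` instead of a one-hot class index, and `decode` recovers a categorical prediction at evaluation time by nearest-neighbour search over the class embeddings.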

Citation

Pages: 2418-2422

Page count: 5

50 items in total

[1] Dimensional speech emotion recognition from speech features and word embeddings by using multitask learning
Atmaja, Bagus Tris
Akagi, Masato
APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2020, 9
[2] Speech Emotion Recognition Using Speech Feature and Word Embedding
Atmaja, Bagus Tris
Shirai, Kiyoaki
Akagi, Masato
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 519 - 523
[3] Enhancing Speech Emotion Recognition Using Transfer Learning from Speaker Embeddings
Jakubec, Maros
Jarina, Roman
Lieskovska, Eva
Kasak, Peter
Spisiak, Michal
TEXT, SPEECH, AND DIALOGUE, TSD 2024, PT II, 2024, 15049 : 184 - 195
[4] Speech Emotion Recognition based on Multi-Label Emotion Existence Model
Ando, Atsushi
Masumura, Ryo
Kamiyama, Havana
Kobashikawa, Satoshi
Aono, Yushi
INTERSPEECH 2019, 2019: 2818 - 2822
[5] Transforming the Embeddings: A Lightweight Technique for Speech Emotion Recognition Tasks
Phukan, Orchid Chetia
Buduru, Arun Balaji
Sharma, Rajesh
INTERSPEECH 2023, 2023: 1903 - 1907
[6] The Emotion is Not One-hot Encoding: Learning with Grayscale Label for Emotion Recognition in Conversation
Lee, Joosung
INTERSPEECH 2022, 2022: 141 - 145
[7] Emotion and Word Recognition for Unprocessed and Vocoded Speech Stimuli
Morgan, Shae D.
Garrard, Stacy
Hoskins, Tiffany
EAR AND HEARING, 2022, 43 (02): 398 - 407
[8] A New Approach on Emotion Analogy by Using Word Embeddings
Ucan, Alaettin
Akcapinar Sezer, Ebru
2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019
[9] Emotion Prompting for Speech Emotion Recognition
Zhou, Xingfa
Li, Min
Yang, Lan
Sun, Rui
Wang, Xin
Zhan, Huayi
INTERSPEECH 2023, 2023: 3108 - 3112
[10] Speech emotion recognition using emotion perception spectral feature
Jiang, Lin
Tan, Ping
Yang, Junfeng
Liu, Xingbao
Wang, Chao
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (11)
