Emotion Label Encoding using Word Embeddings for Speech Emotion Recognition

Cited by: 1
Authors
Stanley, Eimear [1]
DeMattos, Eric [1]
Klementiev, Anita [1]
Ozimek, Piotr [1]
Clarke, Georgia [1]
Berger, Michael [1]
Palaz, Dimitri [1]
Affiliations
[1] Speech Graph Ltd, Edinburgh, Midlothian, Scotland
Source
INTERSPEECH 2023 | 2023
Keywords
speech emotion recognition; word embeddings; label encoding; paralinguistics
DOI
10.21437/Interspeech.2023-1591
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech Emotion Recognition (SER) is an important and challenging task for human-computer interaction. Human emotions are complex and nuanced, hence difficult to represent. The standard representations of emotions, categorical or continuous, tend to oversimplify the problem. Recently, the label encoding approach has been proposed, where vectors are used to represent the emotion space. In this paper, we hypothesise that using a pre-existing vector space that encodes semantic information about emotion is beneficial for the task. To this aim, we propose using word embeddings obtained from a Language Model (LM) as labels for SER. We evaluate the performance of the proposed approach on the IEMOCAP corpus and show that it yields better performance than a standard baseline. We also present a method to combine free text labels, which are unusable in conventional approaches, and by doing so we show that the model can learn more nuanced representations of emotions.
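The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of using word embeddings as training targets. Everything in it is an assumption made for illustration: the toy 3-dimensional vectors stand in for real LM embeddings, `encode_labels` combines several free-text labels by averaging unit-normalised vectors, and `decode` maps a predicted vector back to a category by cosine similarity. None of these choices is confirmed as the authors' method.

```python
import numpy as np

# Toy 3-dimensional "word embeddings" (hypothetical values, illustration only).
# A real system would look these vectors up in a pretrained language model.
EMBEDDINGS = {
    "angry":      np.array([0.9, 0.1, 0.0]),
    "happy":      np.array([0.0, 0.9, 0.2]),
    "sad":        np.array([0.1, 0.0, 0.9]),
    "neutral":    np.array([0.4, 0.4, 0.4]),
    "frustrated": np.array([0.8, 0.0, 0.3]),  # a free-text label, not a class
}

CLASSES = ["angry", "happy", "sad", "neutral"]


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def encode_labels(words):
    """Build one target vector from one or more label words.

    Assumption: several labels are combined by averaging their
    unit-normalised embeddings and renormalising the result.
    """
    vecs = [unit(EMBEDDINGS[w]) for w in words]
    return unit(np.mean(vecs, axis=0))


def decode(pred, classes=CLASSES):
    """Return the class whose embedding is closest to `pred` by cosine similarity."""
    sims = {c: float(unit(pred) @ unit(EMBEDDINGS[c])) for c in classes}
    return max(sims, key=sims.get), sims


# A target built from a categorical label plus a free-text label.
target = encode_labels(["angry", "frustrated"])
label, sims = decode(target)
print(label, sims)
```

In this sketch an acoustic model would be trained to regress onto vectors such as `target`, and `decode` would be applied to its output at test time. A free-text word such as "frustrated" shifts the target within the embedding space without needing a class of its own.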
Pages: 2418-2422
Page count: 5
Related papers
50 in total
  • [1] Dimensional speech emotion recognition from speech features and word embeddings by using multitask learning
    Atmaja, Bagus Tris
    Akagi, Masato
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2020, 9
  • [2] Speech Emotion Recognition Using Speech Feature and Word Embedding
    Atmaja, Bagus Tris
    Shirai, Kiyoaki
    Akagi, Masato
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 519-523
  • [3] Enhancing Speech Emotion Recognition Using Transfer Learning from Speaker Embeddings
    Jakubec, Maros
    Jarina, Roman
    Lieskovska, Eva
    Kasak, Peter
    Spisiak, Michal
    TEXT, SPEECH, AND DIALOGUE, TSD 2024, PT II, 2024, 15049: 184-195
  • [4] Speech Emotion Recognition based on Multi-Label Emotion Existence Model
    Ando, Atsushi
    Masumura, Ryo
    Kamiyama, Havana
    Kobashikawa, Satoshi
    Aono, Yushi
    INTERSPEECH 2019, 2019: 2818-2822
  • [5] Transforming the Embeddings: A Lightweight Technique for Speech Emotion Recognition Tasks
    Phukan, Orchid Chetia
    Buduru, Arun Balaji
    Sharma, Rajesh
    INTERSPEECH 2023, 2023: 1903-1907
  • [6] The Emotion is Not One-hot Encoding: Learning with Grayscale Label for Emotion Recognition in Conversation
    Lee, Joosung
    INTERSPEECH 2022, 2022: 141-145
  • [7] Emotion and Word Recognition for Unprocessed and Vocoded Speech Stimuli
    Morgan, Shae D.
    Garrard, Stacy
    Hoskins, Tiffany
    EAR AND HEARING, 2022, 43 (02): 398-407
  • [8] A New Approach on Emotion Analogy by Using Word Embeddings
    Ucan, Alaettin
    Akcapinar Sezer, Ebru
    2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019
  • [9] Emotion Prompting for Speech Emotion Recognition
    Zhou, Xingfa
    Li, Min
    Yang, Lan
    Sun, Rui
    Wang, Xin
    Zhan, Huayi
    INTERSPEECH 2023, 2023: 3108-3112
  • [10] Speech emotion recognition using emotion perception spectral feature
    Jiang, Lin
    Tan, Ping
    Yang, Junfeng
    Liu, Xingbao
    Wang, Chao
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (11)