Autoencoder with emotion embedding for speech emotion recognition

Cited by: 0

Authors:

Zhang, Chenghao [1]

Xue, Lei [1]

Affiliations:

[1] School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China

Source:

IEEE Access | 2021, Vol. 9

DOI:

Not available

Abstract:

Speech emotion recognition (SER) is an important part of human-computer interaction and has received growing attention in recent years. Although a wide variety of SER methods have been proposed, their performance remains limited. A key cause of this limited performance is the difficulty of effectively extracting emotion-oriented features. In this paper, we propose a novel algorithm, an autoencoder with emotion embedding, to extract deep emotion features. Unlike many previous works, our model uses instance normalization, a common technique in the style transfer field, rather than batch normalization. Furthermore, the emotion embedding path lets the autoencoder efficiently learn a priori knowledge from the labels, which enables the model to distinguish the features most related to human emotion. We concatenate the latent representation learned by the autoencoder with acoustic features obtained by the openSMILE toolkit, and the concatenated feature vector is then used for emotion classification. To improve the generalization of our method, a simple data augmentation approach is applied. Two publicly available and widely used databases, IEMOCAP and EMODB, are chosen to evaluate our method. Experimental results demonstrate that the proposed model achieves significant performance improvement compared to other speech emotion recognition systems. © 2013 IEEE.
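The abstract contrasts instance normalization with batch normalization and describes concatenating a learned latent vector with acoustic features. The following is a minimal pure-Python sketch of those two ideas only, not the authors' implementation: the function names (`instance_norm`, `batch_norm`, `concat_features`), the nested-list layout (sample, channel, frame), and the toy values are all assumptions made for illustration, and the learnable scale and shift parameters that normalization layers usually carry are omitted.

```python
from statistics import fmean, pvariance


def instance_norm(batch, eps=1e-5):
    """Normalize every channel of every sample with that sample's own statistics.

    batch: list of samples; each sample is a list of channels;
    each channel is a list of per-frame float values (hypothetical layout).
    """
    out = []
    for sample in batch:
        normed = []
        for channel in sample:
            mu = fmean(channel)
            var = pvariance(channel, mu)
            normed.append([(x - mu) / (var + eps) ** 0.5 for x in channel])
        out.append(normed)
    return out


def batch_norm(batch, eps=1e-5):
    """Normalize every channel with statistics pooled over all samples and frames."""
    n_channels = len(batch[0])
    stats = []
    for c in range(n_channels):
        pooled = [x for sample in batch for x in sample[c]]
        mu = fmean(pooled)
        stats.append((mu, pvariance(pooled, mu)))
    return [
        [
            [(x - stats[c][0]) / (stats[c][1] + eps) ** 0.5 for x in sample[c]]
            for c in range(n_channels)
        ]
        for sample in batch
    ]


def concat_features(latent, acoustic):
    """Join a latent vector and an acoustic feature vector into one vector."""
    return list(latent) + list(acoustic)


# Two toy "utterances" with one channel each; the second is a scaled copy of the first.
toy_batch = [[[1.0, 2.0, 3.0]], [[10.0, 20.0, 30.0]]]
per_instance = instance_norm(toy_batch)  # per-sample scale differences are removed
per_batch = batch_norm(toy_batch)        # per-sample scale differences are kept
fused = concat_features([0.1, 0.2], [1.0, 2.0, 3.0])  # placeholder feature values
```

In this toy case, instance normalization maps both utterances to nearly the same normalized sequence, whereas batch normalization leaves them offset from each other because the statistics are shared across the batch.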


Pages: 51231-51241

50 items in total

[1] Autoencoder With Emotion Embedding for Speech Emotion Recognition
Zhang, Chenghao
Xue, Lei
IEEE ACCESS, 2021, 9 : 51231 - 51241
[2] Speech Emotion Recognition 'in the wild' Using an Autoencoder
Dissanayake, Vipula
Zhang, Haimo
Billinghurst, Mark
Nanayakkara, Suranga
INTERSPEECH 2020, 2020, : 526 - 530
[3] Two-stream Emotion-embedded Autoencoder for Speech Emotion Recognition
Zhang, Chenghao
Xue, Lei
2021 IEEE INTERNATIONAL IOT, ELECTRONICS AND MECHATRONICS CONFERENCE (IEMTRONICS), 2021, : 969 - 974
[4] Sparse Autoencoder with Attention Mechanism for Speech Emotion Recognition
Sun, Ting-Wei
Wu, An-Yeu
2019 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2019), 2019, : 146 - 149
[5] A VECTOR QUANTIZED MASKED AUTOENCODER FOR SPEECH EMOTION RECOGNITION
Sadok, Samir
Leglaive, Simon
Seguier, Renaud
2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW, 2023,
[6] Speech Emotion Recognition Using Speech Feature and Word Embedding
Atmaja, Bagus Tris
Shirai, Kiyoaki
Akagi, Masato
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 519 - 523
[7] Speech Emotion Recognition Using Spectrogram & Phoneme Embedding
Yenigalla, Promod
Kumar, Abhay
Tripathi, Suraj
Singh, Chirag
Kar, Sibsambhu
Vepa, Jithendra
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3688 - 3692
[8] Performance Evaluation of Deep Autoencoder Network for Speech Emotion Recognition
AndleebSiddiqui, Maria
Hussain, Wajahat
Ali, Syed Abbas
Danish-ur-Rehman
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (02) : 606 - 611
[9] Unsupervised Feature Learning for Speech Emotion Recognition Based on Autoencoder
Ying, Yangwei
Tu, Yuanwu
Zhou, Hong
ELECTRONICS, 2021, 10 (17)
[10] SPEECH EMOTION RECOGNITION USING AUTOENCODER BOTTLENECK FEATURES AND LSTM
Huang, Kun-Yi
Wu, Chung-Hsien
Yang, Tsung-Hsien
Su, Ming-Hsiang
Chou, Jia-Hui
2016 INTERNATIONAL CONFERENCE ON ORANGE TECHNOLOGIES (ICOT), 2018, : 1 - 4
