Speech Emotion Recognition by Combining Amplitude and Phase Information Using Convolutional Neural Network

Cited by: 27

Authors:

Guo, Lili [1]

Wang, Longbiao [1]

Dang, Jianwu [1,2]

Zhang, Linjuan [1]

Guan, Haotian [3]

Li, Xiangang [4]

Affiliations:

[1] Tianjin Univ, Tianjin Key Lab Cognit Comp & Applicat, Tianjin, Peoples R China

[2] Japan Adv Inst Sci & Technol, Nomi, Ishikawa, Japan

[3] Intelligent Spoken Language Technol Tianjin Co, Tianjin, Peoples R China

[4] Didi Chuxing, AI Labs, Beijing, Peoples R China

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Funding:

National Natural Science Foundation of China;

Keywords:

speech emotion recognition; amplitude; phase information; convolutional neural network;

DOI:

10.21437/Interspeech.2018-2156

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Previous studies of speech emotion recognition apply a convolutional neural network (CNN) directly to the amplitude spectrogram to extract features. A CNN combined with bidirectional long short-term memory (BLSTM) has become the state-of-the-art model. However, phase information is ignored in this model, even though its importance in speech processing is attracting growing attention. In this paper, we propose extracting features from the amplitude spectrogram and phase information using a CNN for speech emotion recognition. The modified group delay cepstral coefficient (MGDCC) and relative phase are used as phase information. First, we analyze the influence of phase information on speech emotion recognition. Then we design a CNN-based feature representation using amplitude and phase information. Finally, experiments are conducted on EmoDB to validate the effectiveness of phase information. By integrating the amplitude spectrogram with phase information, the emotion recognition error rate is reduced by over 33% relative to using only amplitude-based features.
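
To illustrate the distinction the abstract draws between amplitude and phase, the following is a minimal sketch that splits a short-time Fourier transform into a log-amplitude spectrogram and a raw phase array. It is not the authors' pipeline: it does not compute MGDCC or relative phase, and the function name `stft_amplitude_phase`, the frame length, the hop size, and the FFT size are all assumptions chosen for illustration (25 ms frames with a 10 ms hop at an assumed 16 kHz sampling rate).

```python
import numpy as np


def stft_amplitude_phase(signal, frame_len=400, hop=160, n_fft=512):
    """Return (log_amplitude, phase), each of shape (frames, n_fft // 2 + 1).

    Hypothetical helper for illustration only; parameter values are assumed.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT of each frame, zero-padded to n_fft points.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    # Amplitude branch: log magnitude, with a small floor to avoid log(0).
    log_amplitude = np.log(np.abs(spectrum) + 1e-10)
    # Phase branch: raw phase angle in radians, which the amplitude discards.
    phase = np.angle(spectrum)
    return log_amplitude, phase


if __name__ == "__main__":
    sample_rate = 16000  # assumed sampling rate
    t = np.arange(sample_rate) / sample_rate
    tone = np.sin(2 * np.pi * 1000 * t)  # 1 kHz test tone, 1 second long
    amp, ph = stft_amplitude_phase(tone)
    print(amp.shape, ph.shape)
```

The two arrays have the same shape, so under these assumptions either could be fed to a CNN as a 2-D input. Raw phase wraps around at ±π, which is one reason derived representations such as group delay or relative phase are used in practice.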


Pages: 1611-1615

Page count: 5
