Speech Emotion Recognition Based on Coiflet Wavelet Packet Cepstral Coefficients

Cited by: 0
Authors
Huang, Yongming [1 ,2 ]
Wu, Ao [1 ,2 ]
Zhang, Guobao [1 ,2 ]
Li, Yue [1 ,2 ]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Jiangsu, Peoples R China
[2] Minist Educ, Key Lab Measurement & Control Complex Syst Engn, Beijing, Peoples R China
Source
PATTERN RECOGNITION (CCPR 2014), PT II | 2014 / Vol. 484
Keywords
Speech emotion recognition; Coiflet Wavelet Packet Cepstral Coefficients (CWPCC); Acoustic features
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a wavelet-packet-based adaptive filter-bank construction method for speech signal processing. On this basis, a set of acoustic features, Coiflet Wavelet Packet Cepstral Coefficients (CWPCC), is proposed for speech emotion recognition. CWPCC extends the conventional Mel-Frequency Cepstral Coefficients (MFCC) by adapting the filter-bank structure to the decision task. A speech emotion recognition system is constructed with the proposed feature set and a Gaussian mixture model as the classifier. Experimental results on the Berlin emotional speech database show that the Coiflet wavelet packet is better suited to speech emotion recognition than other wavelet packets, and that the proposed features improve emotion recognition performance over conventional features.
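As a rough illustration of the idea of cepstral coefficients computed over a wavelet-packet filter bank, the following minimal sketch decomposes one frame with a Coiflet filter, takes the log energy of each subband, and applies a DCT. It is not the authors' method: the abstract describes an adaptive filter-bank structure chosen for the decision task, whereas this sketch assumes a fixed full tree of depth 3. The choice of the 6-tap Coiflet-1 filter, periodic signal extension, the frame length, and all function names are assumptions made for illustration only.

```python
import math

# Coiflet-1 low-pass analysis filter (6 taps). Assumption: the record does
# not say which Coiflet order the paper uses.
COIF1_LO = [
    -0.01565572813546454, -0.0727326195128539, 0.38486484686420286,
    0.8525720202122554, 0.3378976624578092, -0.0727326195128539,
]
# Matching high-pass filter via the quadrature-mirror relation.
COIF1_HI = [((-1) ** n) * COIF1_LO[len(COIF1_LO) - 1 - n]
            for n in range(len(COIF1_LO))]


def analysis_step(x, filt):
    """Filter x with periodic extension, then keep every second sample."""
    n = len(x)
    return [sum(filt[k] * x[(2 * i + k) % n] for k in range(len(filt)))
            for i in range(n // 2)]


def wavelet_packet_subbands(frame, depth):
    """Full wavelet-packet tree; returns 2**depth subbands in frequency order.

    len(frame) must be divisible by 2**depth.
    """
    nodes = [list(frame)]
    for _ in range(depth):
        next_nodes = []
        for j, node in enumerate(nodes):
            lo = analysis_step(node, COIF1_LO)
            hi = analysis_step(node, COIF1_HI)
            # Children of odd-indexed nodes are swapped so that the
            # subbands stay sorted by increasing frequency.
            next_nodes.extend([hi, lo] if j % 2 else [lo, hi])
        nodes = next_nodes
    return nodes


def cepstral_coefficients(frame, depth=3, n_ceps=6):
    """Log subband energies followed by an unnormalized DCT-II."""
    subbands = wavelet_packet_subbands(frame, depth)
    # Small constant avoids log(0) for silent subbands.
    log_e = [math.log(sum(c * c for c in band) / len(band) + 1e-12)
             for band in subbands]
    m = len(log_e)
    return [sum(log_e[j] * math.cos(math.pi * k * (j + 0.5) / m)
                for j in range(m))
            for k in range(n_ceps)]


# Usage on a synthetic 256-sample frame (a real system would use windowed
# speech frames instead).
frame = [math.sin(2 * math.pi * 5 * t / 256) for t in range(256)]
ceps = cepstral_coefficients(frame)
```

In a pipeline like the one the abstract outlines, such per-frame vectors would then be passed to a classifier such as a Gaussian mixture model. That step is omitted here.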
Pages: 436-443
Number of pages: 8
Related papers
12 in total
[1]  
[Anonymous], 2009, A Wavelet Tour of Signal Processing
[2]  
Burkhardt F., 2005, INTERSPEECH, V5, P1517, DOI 10.21437/INTERSPEECH.2005-446
[3]  
Caponetti L., EURASIP J ADV SIGNAL
[4]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[5]  
Daubechies I., 1992, Ten lectures on wavelets, DOI 10.1137/1.9781611970104
[6]   Acoustical properties of speech as indicators of depression and suicidal risk [J].
France, DJ ;
Shiavi, RG ;
Silverman, S ;
Silverman, M ;
Wilkes, DM .
IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2000, 47 (07) :829-837
[7]   Multimodal estimation of a driver's spontaneous irritation [J].
Malta, Lucas ;
Miyajima, Chiyomi ;
Kitaoka, Norihide ;
Takeda, Kazuya .
2009 IEEE INTELLIGENT VEHICLES SYMPOSIUM, VOLS 1 AND 2, 2009, :573-577
[8]   Ensemble methods for spoken emotion recognition in call-centres [J].
Morrison, Donn ;
Wang, Ruili ;
De Silva, Liyanage C. .
SPEECH COMMUNICATION, 2007, 49 (02) :98-112
[9]   Analysis and design of Wavelet-Packet Cepstral coefficients for automatic speech recognition [J].
Pavez, Eduardo ;
Silva, Jorge F. .
SPEECH COMMUNICATION, 2012, 54 (06) :814-835
[10]  
Rabiner L. R., 1993, Fundamentals of Speech Recognition