Noisy speech emotion recognition using sample reconstruction and multiple-kernel learning

Cited by: 6
Authors
Xiaoqing J. [1, 2]
Kewen X. [1]
Yongliang L. [1, 3]
Jianchuan B. [1]
Affiliations
[1] School of Electronic and Information Engineering, Hebei University of Technology, Tianjin
[2] School of Information Science and Engineering, University of Jinan, Jinan
[3] Information Center, Tianjin Chengjian University, Tianjin
Source
Journal of China Universities of Posts and Telecommunications | 2017 / Vol. 24 / No. 02
Keywords
compressed sensing; feature selection; multiple-kernel learning; speech emotion recognition
DOI
10.1016/S1005-8885(17)60193-6
Abstract
Speech emotion recognition (SER) in noisy environments is a vital issue in artificial intelligence (AI). In this paper, speech samples are reconstructed to remove the added noise. Acoustic features extracted from the reconstructed samples are then selected to build an optimal feature subset with better emotional recognizability. A multiple-kernel (MK) support vector machine (SVM) classifier, solved by semi-definite programming (SDP), is adopted in the SER procedure. The proposed method is demonstrated on the Berlin Database of Emotional Speech. Recognition accuracies of the original, noisy, and reconstructed samples, classified by both single-kernel (SK) and MK classifiers, are compared and analyzed. The experimental results show that the proposed method is effective and robust in the presence of noise. © 2017 The Journal of China Universities of Posts and Telecommunications
Pages: 1,17 / 9