Speech emotion recognition based on an improved supervised manifold learning algorithm

Cited: 3
Authors
Zhang S.-Q. [1 ,3 ]
Li L.-M. [1 ]
Zhao Z.-J. [2 ]
Affiliations
[1] School of Communication and Information Engineering, University of Electronic Science and Technology of China
[2] School of Telecommunication, Hangzhou Dianzi University
[3] School of Physics and Electronic Engineering, Taizhou University
Keywords
Manifold learning; Nonlinear dimensionality reduction; Speech emotion recognition; Supervised locally linear embedding
DOI
10.3724/SP.J.1146.2009.01430
Abstract
To effectively improve speech emotion recognition performance, nonlinear dimensionality reduction is needed for speech feature data lying on a nonlinear manifold embedded in a high-dimensional acoustic space. Supervised Locally Linear Embedding (SLLE) is a typical supervised manifold learning algorithm for nonlinear dimensionality reduction. To address the drawbacks of SLLE, this paper proposes an improved version that enhances the discriminating power of the low-dimensional embedded data and offers better generalization ability. The proposed algorithm is used to perform nonlinear dimensionality reduction on 48-dimensional speech emotional feature data comprising prosody and voice quality features, and to extract low-dimensional embedded discriminating features for recognizing four emotions: anger, joy, sadness, and neutral. Experimental results on a natural emotional speech database show that the proposed algorithm achieves the highest accuracy of 90.78% with as few as 9 embedded features, a 15.65% improvement over SLLE. The proposed algorithm can therefore significantly improve speech emotion recognition when applied to reduce the dimensionality of speech emotional feature data.
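For orientation, the SLLE baseline the abstract refers to can be sketched as follows. This is an illustrative NumPy implementation of the standard supervised-distance variant of LLE (inflating between-class distances before neighbor selection), not the paper's improved algorithm; the function name and parameter choices are assumptions for illustration only.

```python
import numpy as np

def slle(X, y, n_neighbors=5, n_components=2, alpha=0.5, reg=1e-3):
    """Sketch of baseline Supervised LLE.

    X: (n, d) feature matrix, y: (n,) class labels.
    Distances between points of different classes are inflated by
    alpha * max(distance), then standard LLE is applied.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Supervised step: penalize pairs with different labels.
    diff = (y[:, None] != y[None, :]).astype(float)
    Ds = D + alpha * D.max() * diff
    np.fill_diagonal(Ds, np.inf)  # never pick a point as its own neighbor

    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(Ds[i])[:n_neighbors]
        Z = X[idx] - X[i]                 # neighbors centered at x_i
        G = Z @ Z.T                       # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)  # regularization
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, idx] = w / w.sum()           # reconstruction weights sum to 1

    # Embedding: bottom eigenvectors of M = (I - W)^T (I - W),
    # skipping the trivial constant eigenvector.
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]
```

The paper's contribution, per the abstract, lies in how it modifies this baseline to sharpen class discrimination and generalization in the embedded space.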
Pages: 2724-2729
Page count: 5