Dimensionality reduction-based spoken emotion recognition

Cited by: 0
Authors
Shiqing Zhang
Xiaoming Zhao
Affiliations
[1] Taizhou University,School of Physics and Electronic Engineering
[2] Taizhou University,Department of Computer Science
Source
Multimedia Tools and Applications | 2013 / Vol. 63
Keywords
Emotion recognition; Dimensionality reduction; Manifold learning;
DOI: not available
Abstract
To effectively improve spoken emotion recognition performance, nonlinear dimensionality reduction is needed for speech data lying on a nonlinear manifold embedded in a high-dimensional acoustic space. In this paper, a new supervised manifold learning algorithm for nonlinear dimensionality reduction, called the modified supervised locally linear embedding (MSLLE) algorithm, is proposed for spoken emotion recognition. MSLLE aims to enlarge the interclass distance while shrinking the intraclass distance, in order to promote the discriminating power and generalization ability of low-dimensional embedded data representations. To assess the performance of MSLLE, it is compared against three unsupervised dimensionality reduction methods, i.e., principal component analysis (PCA), locally linear embedding (LLE) and isometric mapping (Isomap), as well as five supervised dimensionality reduction methods, i.e., linear discriminant analysis (LDA), supervised locally linear embedding (SLLE), local Fisher discriminant analysis (LFDA), neighborhood component analysis (NCA) and maximally collapsing metric learning (MCML), on spoken emotion recognition tasks. Experimental results on two emotional speech databases, i.e., the spontaneous Chinese database and the acted Berlin database, confirm the validity and promising performance of the proposed method.
Pages: 615–646 (31 pages)
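The abstract describes the core idea behind supervised variants of LLE: bias neighbor selection toward same-class points by inflating pairwise distances across class boundaries, which enlarges the interclass distance relative to the intraclass distance before the embedding step. The paper's exact MSLLE formulation is not given here, so the sketch below uses the standard SLLE-style distance modification as an illustration; the function name, the toy data, and the parameter `alpha` are assumptions for demonstration only.

```python
import numpy as np

def supervised_distances(X, y, alpha=0.5):
    """SLLE-style class-aware pairwise distances (illustrative sketch,
    not the paper's exact MSLLE formulation).

    Distances between samples of different classes are inflated by
    alpha * max(D), so that k-nearest-neighbor selection in the LLE
    step favors same-class points.
    """
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))      # Euclidean distance matrix
    same = y[:, None] == y[None, :]            # True where labels match
    # Add a penalty only to cross-class pairs; alpha in [0, 1] controls
    # how strongly supervision overrides raw geometry.
    return D + alpha * D.max() * (~same)

# Toy data: two well-separated classes in a 3-D "acoustic" feature space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (5, 3)),
               rng.normal(3.0, 1.0, (5, 3))])
y = np.array([0] * 5 + [1] * 5)

D = supervised_distances(X, y)
# Index of each sample's nearest neighbor, excluding itself.
nn = D.argsort(axis=1)[:, 1]
```

With `alpha = 0` this reduces to ordinary LLE neighbor selection; larger values push every sample's nearest neighbors toward its own class, which is the mechanism by which such methods shrink intraclass and enlarge interclass distances in the embedded space.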