Emotion recognition using acoustic features and textual content

Cited by: 35
Authors
Chuang, ZJ [1]
Wu, CH [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Source
2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1-3 | 2004
DOI
10.1109/ICME.2004.1394123
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This paper presents an approach to emotion recognition from speech signals and textual content. In the analysis of speech signals, thirty-three acoustic features are extracted from the speech input. After Principal Component Analysis (PCA), 14 principal components are selected for a discriminative representation, in which each principal component is a linear combination of the 33 original acoustic features and forms a feature subspace. Support Vector Machines (SVMs) are adopted to classify the emotional states. In text analysis, all emotional keywords and emotion modification words are manually defined, and their emotion intensity levels are estimated from a collected emotion corpus. The final emotional state is determined from the emotion outputs of the acoustic and textual approaches. Experimental results show that the recognition accuracy of the integrated system is higher than that of either individual approach.
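The acoustic pipeline the abstract describes (33 features → PCA → 14 principal components → SVM) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature matrix is synthetic random data standing in for the paper's 33 acoustic features, PCA is computed directly via SVD, and the SVM classification step is only indicated in a comment since the paper's training corpus is not available.

```python
import numpy as np

# Synthetic stand-in: 200 utterances x 33 acoustic features
# (the paper's actual features, e.g. pitch- and energy-related
# measures, are not reproduced here).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 33))

# PCA via SVD: center the data, then project onto the top 14
# principal components, matching the dimensionality in the paper.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:14]   # each row is a linear combination
                       # of the 33 original features
X_reduced = X_centered @ components.T

print(X_reduced.shape)  # (200, 14)
# An SVM classifier (e.g. one-vs-rest over the emotion classes)
# would then be trained on X_reduced.
```

The projection keeps the directions of largest variance, which is the discriminative-subspace role PCA plays in the paper's acoustic front end.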
Pages: 53-56
Page count: 4
References
6 items
  • [1] CHUANG ZJ, 2002, P IEEE INT C SPOK LA
  • [2] COHN JF, 1998, ACM INT MULT C SEPT
  • [3] De Silva L. C., 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580), P332, DOI 10.1109/AFGR.2000.840655
  • [4] FUKUDA S, 1999, IEEE INT WORKSH SYST, V4, P299
  • [5] Reeves B., 1996, The media equation: How people treat computers, television, and new media like real people and places
  • [6] YU F, 2001, IEEE PAC RIM C MULT, P550