Speaker characteristics and emotion classification

Cited by: 10
Authors
Mustererkennung, Universität Erlangen-Nürnberg, Martensstr. 3, 91058 Erlangen, Germany [1]
Not specified [2]
Affiliations
[1] Mustererkennung, Universität Erlangen-Nürnberg, 91058 Erlangen
[2] Sympalog Voice Solutions GmbH, 91052 Erlangen
Source
Lect. Notes Comput. Sci. | 2007 / pp. 138-151
Keywords
Acoustic features; Automatic classification; Emotion; Laryngealization; Speaker dependency; System architecture; Voice application
DOI
10.1007/978-3-540-74200-5_7
Abstract
In this paper, we address the interrelated problems of speaker characteristics (personalization) and suboptimal performance of emotion classification in state-of-the-art modules from two different points of view: first, we focus on a specific phenomenon (irregular phonation, or laryngealization) and argue that its inherent multi-functionality and speaker dependency make its use as a feature in emotion classification less promising than one might expect. Second, we focus on a specific application of emotion recognition in a voice portal and argue that constraints on time and budget often prevent the implementation of an optimal emotion recognition module. © Springer-Verlag Berlin Heidelberg 2007.
Pages: 138-151
Page count: 13
Related papers
30 items in total
  • [1] Cowie R., Cornelius R., Describing the emotional states that are expressed in speech, Speech Communication, 40, pp. 5-32, (2003)
  • [2] Schuller B., Müller R., Lang M., Rigoll G., Speaker Independent Emotion Recognition by Early Fusion of Acoustic and Linguistic Features within Ensembles, Proc. 9th Eurospeech - Interspeech, pp. 805-808, (2005)
  • [3] Labov W., The Study of Language in its Social Context, Studium Generale, 3, pp. 30-87, (1970)
  • [4] Batliner A., Steidl S., Schuller B., Seppi D., Laskowski K., Vogt T., Devillers L., Vidrascu L., Amir N., Kessous L., Aharonson V., Combining Efforts for Improving Automatic Classification of Emotional User States, Proceedings of IS-LTC, pp. 240-245, (2006)
  • [5] Batliner A., Steidl S., Hacker C., Nöth E., Niemann H., Tales of Tuning - Prototyping for Automatic Classification of Emotional User States, Proc. 9th Eurospeech - Interspeech, pp. 489-492, (2005)
  • [6] Scherer K., Vocal communication of emotion: A review of research paradigms, Speech Communication, 40, pp. 227-256, (2003)
  • [7] Poggi I., Pelachaud C., de Carolis B., To Display or Not To Display? Towards the Architecture of a Reflexive Agent, Proceedings of the 2nd Workshop on Attitude, Personality and Emotions in User-adapted Interaction, User Modeling, (2001)
  • [8] Local J., Kelly J., Projection and 'silences': Notes on phonetic and conversational structure, Human Studies, 9, pp. 185-204, (1986)
  • [9] Surana K., Slifka J., Is irregular phonation a reliable cue towards the segmentation of continuous speech in American English?, Proc. of Speech Prosody, pp. 795-798, (2006)
  • [10] Ní Chasaide A., Gobl C., Voice Quality and f0 in Prosody: Towards a Holistic Account, Proc. of Speech Prosody, (2004)