Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011

Cited by: 0
Authors
Christos-Nikolaos Anagnostopoulos
Theodoros Iliou
Ioannis Giannoukos
Affiliations
[1] University of the Aegean, Cultural Technology and Communication Department
Source
Artificial Intelligence Review | 2015, Vol. 43
Keywords
Speech features; Emotion recognition; Classifiers
DOI
Not available
Abstract
Speaker emotion recognition is achieved through processing methods that include isolation of the speech signal and extraction of selected features for the final classification. In terms of acoustics, speech processing techniques offer extremely valuable paralinguistic information derived mainly from prosodic and spectral features. In some cases, the process is assisted by speech recognition systems, which contribute to the classification using linguistic information. Both frameworks deal with a very challenging problem, as emotional states do not have clear-cut boundaries and often differ from person to person. In this article, research papers that investigate emotion recognition from audio channels are surveyed and classified, based mostly on the extracted and selected features and the classification methodology. Important topics are discussed, including databases available for experimentation, appropriate feature extraction and selection methods, classifiers, and performance issues, with emphasis on research published in the last decade. The survey also provides a discussion of open issues and trends, along with directions for future research on this topic.
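To make the pipeline described in the abstract concrete, the sketch below extracts a few common prosodic and spectral features (pitch, frame energy, MFCCs), summarizes them into utterance-level statistics, and trains a standard classifier. The feature set, the SVM, the librosa/scikit-learn tooling, and all file paths are illustrative assumptions for this sketch, not the specific methods compared in the survey.

```python
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def utterance_features(path):
    """Map one utterance to a fixed-length vector of prosodic/spectral statistics."""
    y, sr = librosa.load(path, sr=16000)
    # Spectral cues: 13 MFCCs per frame.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    # Prosodic cues: fundamental frequency (pitch) and frame energy.
    f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr)
    energy = librosa.feature.rms(y=y)[0]
    # Summarize frame-level features by mean and standard deviation over time.
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        [f0.mean(), f0.std()],
        [energy.mean(), energy.std()],
    ])

# Hypothetical labelled corpus: file paths and emotion labels are placeholders.
train_paths = ["angry_01.wav", "neutral_01.wav", "sad_01.wav"]
train_labels = ["anger", "neutral", "sadness"]

X = np.vstack([utterance_features(p) for p in train_paths])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, train_labels)
print(clf.predict([utterance_features("test_utterance.wav")]))  # placeholder path
```

Summarizing frame-level features by their mean and standard deviation is one simple way to obtain the fixed-length vectors that most classifiers expect; much of the surveyed work uses considerably larger statistical feature sets and compares several classifiers on them.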
Pages: 155-177
Number of pages: 22