Emotion recognition from speech: a review

Cited by: 183
Authors
Koolagudi, Shashidhar G. [1]
Rao, K. Sreenivasa [1]
Affiliations
[1] Indian Institute of Technology Kharagpur, School of Information Technology, Kharagpur 721302, West Bengal, India
Keywords
Emotion recognition; Simulated emotional speech corpus; Elicited speech corpus; Natural speech corpus; Excitation source features; System features; Prosodic features; Classification models
DOI
10.1007/s10772-011-9125-1
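Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809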
Abstract
Emotion recognition from speech has emerged as an important research area in the recent past. In this regard, a review of existing work on emotional speech processing is useful for carrying out further research. In this paper, the recent literature on speech emotion recognition is presented, considering the issues related to emotional speech corpora, the different types of speech features, and the models used for recognition of emotions from speech. Thirty-two representative speech databases are reviewed in this work from the point of view of their language, number of speakers, number of emotions, and purpose of collection. The issues related to emotional speech databases used in emotional speech recognition are also briefly discussed. Literature on the different features used in the task of emotion recognition from speech is presented. The importance of choosing different classification models is discussed along with the review. The important issues to be considered for further emotion recognition research, in general and specifically in the Indian context, are highlighted wherever necessary.
Pages: 99-117
Page count: 19
Related papers
50 in total
  • [21] Evaluating intonational features for emotion recognition from speech
    Zervas, Panagiotis
    Mporas, Iosif
    Fakotakis, Nikos
    Kokkinakis, George
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2007, 16 (06) : 1001 - 1014
  • [22] SUPERVISED DOMAIN ADAPTATION FOR EMOTION RECOGNITION FROM SPEECH
    Abdelwahab, Mohammed
    Busso, Carlos
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5058 - 5062
  • [23] Autoencoder With Emotion Embedding for Speech Emotion Recognition
    Zhang, Chenghao
    Xue, Lei
    IEEE ACCESS, 2021, 9 : 51231 - 51241
  • [24] Anchor Model Fusion for Emotion Recognition in Speech
    Ortego-Resa, Carlos
    Lopez-Moreno, Ignacio
    Ramos, Daniel
    Gonzalez-Rodriguez, Joaquin
    BIOMETRIC ID MANAGEMENT AND MULTIMODAL COMMUNICATION, PROCEEDINGS, 2009, 5707 : 49 - 56
  • [25] Learning Alignment for Multimodal Emotion Recognition from Speech
    Xu, Haiyang
    Zhang, Hui
    Han, Kun
    Wang, Yun
    Peng, Yiping
    Li, Xiangang
    INTERSPEECH 2019, 2019, : 3569 - 3573
  • [26] Emotion recognition from the facial image and speech signal
    Go, HJ
    Kwak, KC
    Lee, DJ
    Chun, MG
    SICE 2003 ANNUAL CONFERENCE, VOLS 1-3, 2003, : 2890 - 2895
  • [27] A DIMENSIONAL APPROACH TO EMOTION RECOGNITION OF SPEECH FROM MOVIES
    Giannakopoulos, Theodoros
    Pikrakis, Aggelos
    Theodoridis, Sergios
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 65 - 68
  • [28] Emotion Recognition from Speech: An Unsupervised Learning Approach
    Rovetta, Stefano
    Mnasri, Zied
    Masulli, Francesco
    Cabri, Alberto
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2021, 14 (01) : 23 - 35
  • [29] Emotion Recognition in Arabic Speech
    Klaylat, Samira
    Hamandi, Lama
    Osman, Ziad
    Zantout, Rached
    2017 SENSORS NETWORKS SMART AND EMERGING TECHNOLOGIES (SENSET), 2017,
  • [30] Emotion recognition in Arabic speech
    Samira Klaylat
    Ziad Osman
    Lama Hamandi
    Rached Zantout
    Analog Integrated Circuits and Signal Processing, 2018, 96 : 337 - 351