Emotion recognition from speech: a review

Cited by: 183

Authors:

Koolagudi, Shashidhar G. [1]

Rao, K. Sreenivasa [1]

Affiliation:

[1] Indian Inst Technol Kharagpur, Sch Informat Technol, Kharagpur 721302, W Bengal, India

Source:

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY | 2012 / Vol. 15 / Issue 02

Keywords:

Emotion recognition; Simulated emotional speech corpus; Elicited speech corpus; Natural speech corpus; Excitation source features; System features; Prosodic features; Classification models;

DOI:

10.1007/s10772-011-9125-1

Chinese Library Classification:

TM [Electrical technology]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Emotion recognition from speech has emerged as an important research area in the recent past. In this regard, a review of existing work on emotional speech processing is useful for carrying out further research. This paper presents the recent literature on speech emotion recognition, considering the issues related to emotional speech corpora, the different types of speech features, and the models used for recognizing emotions from speech. Thirty-two representative speech databases are reviewed from the point of view of their language, number of speakers, number of emotions, and purpose of collection. The issues related to the emotional speech databases used in emotional speech recognition are also briefly discussed. Literature on the different features used for emotion recognition from speech is presented. The importance of choosing different classification models is discussed along with the review. The important issues to be considered for further emotion recognition research, both in general and specifically in the Indian context, are highlighted wherever necessary.

Pages: 99-117

Page count: 19

