Evaluating deep learning architectures for Speech Emotion Recognition

Cited by: 356

Authors:

Fayek, Haytham M. [1]

Lech, Margaret [1]

Cavedon, Lawrence [2]

Affiliations:

[1] RMIT Univ, Sch Engn, Melbourne, Vic 3001, Australia

[2] RMIT Univ, Sch Sci, Melbourne, Vic 3001, Australia

Source:

NEURAL NETWORKS | 2017 / Vol. 92

Keywords:

Affective computing; Deep learning; Emotion recognition; Neural networks; Speech recognition; NEURAL-NETWORKS; FEATURES

DOI:

10.1016/j.neunet.2017.02.013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Speech Emotion Recognition (SER) can be regarded as a static or dynamic classification problem, which makes SER an excellent test bed for investigating and comparing various deep learning architectures. We describe a frame-based formulation to SER that relies on minimal speech processing and end-to-end deep learning to model intra-utterance dynamics. We use the proposed SER system to empirically explore feed-forward and recurrent neural network architectures and their variants. Experiments conducted illuminate the advantages and limitations of these architectures in paralinguistic speech recognition and emotion recognition in particular. As a result of our exploration, we report state-of-the-art results on the IEMOCAP database for speaker-independent SER and present quantitative and qualitative assessments of the models' performances. (C) 2017 Elsevier Ltd. All rights reserved.

Pages: 60-68

Page count: 9
