Deep Temporal Models using Identity Skip-Connections for Speech Emotion Recognition

Cited by: 24
Authors
Kim, Jaebok [1]
Englebienne, Gwenn [1]
Truong, Khiet P. [1]
Evers, Vanessa [1]
Affiliations
[1] Univ Twente, Drienerlolaan 5, NL-7522 NB Enschede, Netherlands
Source
PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17) | 2017
Funding
European Union Seventh Framework Programme (FP7)
Keywords
deep learning; speech emotion recognition; residual network; highway network; representation; features
DOI
10.1145/3123266.3123353
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Deep architectures using identity skip-connections have demonstrated groundbreaking performance in the field of image classification. Recently, empirical studies suggested that identity skip-connections enable ensemble-like behaviour of shallow networks, and that depth is not the sole ingredient for their success. Therefore, we examine the potential of identity skip-connections for the task of Speech Emotion Recognition (SER), where moderately deep temporal architectures are often employed. To this end, we propose a novel architecture which regulates unimpeded feature flows and captures long-term dependencies via gate-based skip-connections and a memory mechanism. Our proposed architecture is compared to other state-of-the-art methods of SER and is evaluated on large aggregated corpora recorded in different contexts. Our proposed architecture outperforms the state-of-the-art methods by 9-15% and achieves an Unweighted Accuracy of 80.5% under an imbalanced class distribution. In addition, we examine a variant adopting the simplified skip-connections of Residual Networks (ResNet) and show that gate-based skip-connections are more effective than simplified skip-connections.
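The abstract contrasts gate-based skip-connections with the simplified skip-connections of ResNet and reports an Unweighted Accuracy. The sketch below illustrates those ideas in a generic form; it is not the authors' architecture. It assumes the usual highway-network form y = T(x)·H(x) + (1 − T(x))·x for the gated case, the usual residual form y = H(x) + x for the simplified case, and the definition of Unweighted Accuracy common in SER work (the mean of per-class recalls). The scalar weights and the function names are hypothetical and exist only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def transform(x, w, b):
    """Hypothetical element-wise transform H(x) = tanh(w * x + b)."""
    return [math.tanh(w * xi + b) for xi in x]

def residual_skip(x, w, b):
    """Simplified (ResNet-style) skip-connection: y = H(x) + x."""
    h = transform(x, w, b)
    return [hi + xi for hi, xi in zip(h, x)]

def gated_skip(x, w, b, w_gate, b_gate):
    """Gate-based (highway-style) skip-connection:
    y = T(x) * H(x) + (1 - T(x)) * x, with T(x) = sigmoid(w_gate * x + b_gate).
    """
    h = transform(x, w, b)
    t = [sigmoid(w_gate * xi + b_gate) for xi in x]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally
    regardless of how many samples it has (assumed definition)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)
```

With a strongly negative gate bias the gated layer passes its input through almost unchanged, while a strongly positive bias makes it behave like the plain transform; the residual variant has no such control and always adds the input back. On an imbalanced label set, a classifier that always predicts the majority class can score high plain accuracy while its Unweighted Accuracy stays low, which is why the metric suits imbalanced class distributions.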
Pages: 1006-1013
Page count: 8