Multi-modal feature fusion based on multi-layers LSTM for video emotion recognition

Cited by: 20
Authors
Nie, Weizhi [1]
Yan, Yan [1]
Song, Dan [1]
Wang, Kun [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Emotion recognition; Feature fusion; LSTM; Multi-modal
DOI
10.1007/s11042-020-08796-8
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Emotion is a key element in video data. However, it is difficult to understand the emotions conveyed in videos because the frames that express emotion are sparse. Meanwhile, some approaches proposed in recent years treat utterances as independent entities and ignore the inter-dependencies and relations among them. These approaches also overlook multi-modal feature fusion, a key part of the feature learning process. To address these problems, we propose an LSTM-based model that accounts for the relations among utterances and handles multi-modal feature fusion during learning. Experiments on several popular datasets demonstrate the effectiveness of our approach.
Pages: 16205-16214
Page count: 10