Learning Expression Features via Deep Residual Attention Networks for Facial Expression Recognition From Video Sequences

Cited by: 9
Authors
Zhao, Xiaoming [1]
Chen, Gang [1,2]
Chuang, Yuelong [1]
Tao, Xin [1]
Zhang, Shiqing [1]
Affiliations
[1] Taizhou Univ, Inst Intelligent Information Proc, Taizhou 318000, Zhejiang, Peoples R China
[2] Zhejiang Sci Tech Univ, Fac Mech Engn & Automat, Hangzhou 310018, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China; US National Science Foundation (NSF)
Keywords
Attention mechanism; Deep residual attention networks; Facial expression recognition; Multi-layer perceptron; Video sequences; FACE; FUSION; VECTOR;
DOI
10.1080/02564602.2020.1814168
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
Facial expression recognition from video sequences is currently an active research topic in computer vision, pattern recognition, artificial intelligence, etc. Owing to the semantic gap between the hand-designed features extracted from affective videos and subjective emotions, recognizing facial expressions from video sequences is a challenging task. To tackle this problem, this paper proposes a new method for facial expression recognition from video sequences via deep residual attention networks. Firstly, because the local regions of a facial image differ in the intensity of emotion they convey, deep residual attention networks are employed to learn high-level affective expression features for each frame of the facial expression images in a video sequence. The deep residual attention networks integrate deep residual networks with a spatial attention mechanism. Then, average-pooling is performed to produce a fixed-length global video-level feature representation. Finally, the global video-level feature representations are fed into a multi-layer perceptron to perform facial expression classification on video sequences. Experimental results on two public video emotion datasets, i.e. BAUM-1s and RML, demonstrate the effectiveness of the proposed method.
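As a rough illustration of the pipeline summarized in the abstract (per-frame residual-attention features, average pooling across frames, and an MLP classifier), the following PyTorch sketch wires these stages together. The ResNet-50 backbone, the 1x1-convolution attention design, the layer sizes, and the number of expression classes are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
from torchvision.models import resnet50


class SpatialAttention(nn.Module):
    """Softmax-weights each spatial location of a CNN feature map, then pools."""

    def __init__(self, channels):
        super().__init__()
        # 1x1 convolution producing one attention logit per spatial location (assumed design)
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, fmap):                               # fmap: (B, C, H, W)
        b, _, h, w = fmap.shape
        logits = self.score(fmap).view(b, 1, h * w)        # (B, 1, H*W)
        weights = torch.softmax(logits, dim=-1).view(b, 1, h, w)
        return (fmap * weights).sum(dim=(2, 3))            # attention-pooled feature (B, C)


class VideoExpressionClassifier(nn.Module):
    """Frame-level residual-attention features -> average pooling -> MLP classifier."""

    def __init__(self, num_classes=6):                     # class count is an assumption
        super().__init__()
        backbone = resnet50(weights=None)                  # deep residual network backbone
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # keep conv stages only
        self.attention = SpatialAttention(channels=2048)
        self.mlp = nn.Sequential(                          # multi-layer perceptron
            nn.Linear(2048, 512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, num_classes),
        )

    def forward(self, clip):                               # clip: (B, T, 3, H, W) frame sequence
        b, t = clip.shape[:2]
        frames = clip.flatten(0, 1)                        # (B*T, 3, H, W)
        feats = self.attention(self.cnn(frames))           # per-frame features (B*T, 2048)
        video_feat = feats.view(b, t, -1).mean(dim=1)      # average pooling over frames
        return self.mlp(video_feat)                        # expression logits (B, num_classes)


if __name__ == "__main__":
    model = VideoExpressionClassifier(num_classes=6)
    logits = model(torch.randn(2, 16, 3, 224, 224))        # 2 clips of 16 frames each
    print(logits.shape)                                    # torch.Size([2, 6])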
Pages: 602-610
Number of pages: 9