Learning Expression Features via Deep Residual Attention Networks for Facial Expression Recognition From Video Sequences

Cited by: 9
Authors
Zhao, Xiaoming [1]
Chen, Gang [1,2]
Chuang, Yuelong [1]
Tao, Xin [1]
Zhang, Shiqing [1]
Affiliations
[1] Taizhou Univ, Inst Intelligent Information Proc, Taizhou 318000, Zhejiang, Peoples R China
[2] Zhejiang Sci Tech Univ, Fac Mech Engn & Automat, Hangzhou 310018, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China; US National Science Foundation (NSF)
Keywords
Attention mechanism; Deep residual attention networks; Facial expression recognition; Multi-layer perceptron; Video sequences; FACE; FUSION; VECTOR;
DOI
10.1080/02564602.2020.1814168
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
Facial expression recognition from video sequences is currently an active research topic in computer vision, pattern recognition, artificial intelligence, etc. Owing to the semantic gap between the hand-designed features extracted from affective videos and subjective emotions, recognizing facial expressions from video sequences is a challenging task. To tackle this problem, this paper proposes a new method for facial expression recognition from video sequences via deep residual attention networks. Firstly, because the local regions of a facial image differ in the intensity of emotion they convey, deep residual attention networks are employed to learn high-level affective expression features for each frame of the facial expression images in a video sequence. The deep residual attention networks integrate deep residual networks with a spatial attention mechanism. Then, average-pooling is performed to produce a fixed-length global video-level feature representation. Finally, the global video-level feature representations are fed into a multi-layer perceptron to perform facial expression classification on video sequences. Experimental results on two public video emotion datasets, i.e. BAUM-1s and RML, demonstrate the effectiveness of the proposed method.
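As a rough illustration of the pipeline summarized in the abstract (per-frame residual-attention features, average pooling across frames, and an MLP classifier), the following PyTorch sketch wires these stages together. The ResNet-50 backbone, the 1x1-convolution attention design, the layer sizes, and the number of expression classes are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
from torchvision.models import resnet50


class SpatialAttention(nn.Module):
    """Softmax-weights each spatial location of a CNN feature map, then pools."""

    def __init__(self, channels):
        super().__init__()
        # 1x1 convolution producing one attention logit per spatial location (assumed design)
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, fmap):                               # fmap: (B, C, H, W)
        b, _, h, w = fmap.shape
        logits = self.score(fmap).view(b, 1, h * w)        # (B, 1, H*W)
        weights = torch.softmax(logits, dim=-1).view(b, 1, h, w)
        return (fmap * weights).sum(dim=(2, 3))            # attention-pooled feature (B, C)


class VideoExpressionClassifier(nn.Module):
    """Frame-level residual-attention features -> average pooling -> MLP classifier."""

    def __init__(self, num_classes=6):                     # class count is an assumption
        super().__init__()
        backbone = resnet50(weights=None)                  # deep residual network backbone
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # keep conv stages only
        self.attention = SpatialAttention(channels=2048)
        self.mlp = nn.Sequential(                          # multi-layer perceptron
            nn.Linear(2048, 512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, num_classes),
        )

    def forward(self, clip):                               # clip: (B, T, 3, H, W) frame sequence
        b, t = clip.shape[:2]
        frames = clip.flatten(0, 1)                        # (B*T, 3, H, W)
        feats = self.attention(self.cnn(frames))           # per-frame features (B*T, 2048)
        video_feat = feats.view(b, t, -1).mean(dim=1)      # average pooling over frames
        return self.mlp(video_feat)                        # expression logits (B, num_classes)


if __name__ == "__main__":
    model = VideoExpressionClassifier(num_classes=6)
    logits = model(torch.randn(2, 16, 3, 224, 224))        # 2 clips of 16 frames each
    print(logits.shape)                                    # torch.Size([2, 6])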
Pages: 602-610
Number of pages: 9