Video expression recognition based on frame-level attention mechanism

Cited by: 0
Authors
Chen R. [1]
Tong Y. [1]
Zhang Y. [2]
Xu B. [2]
Affiliations
[1] College of Information & Communication Engineering, Nanjing Institute of Technology, Nanjing
[2] Jiangsu Future Network Innovation Research Institute, Nanjing
Keywords
attention mechanism; enhanced feature; facial expression recognition (FER); feature extraction; image classification; neural network; VGG network; video sequence
DOI
10.3772/j.issn.1006-6748.2023.02.003
Abstract
Facial expression recognition (FER) in video has attracted increasing interest, and many approaches have been proposed. The crucial problem in classifying a given video sequence into one of several basic emotions is how to fuse the facial features of individual frames. In this paper, a frame-level attention module is integrated into an improved VGG-based framework, and a lightweight facial expression recognition method is proposed. The proposed network takes a sub-video cut from an experimental video sequence as its input and generates a fixed-dimension representation. The VGG-based network with an enhanced branch embeds face images into feature vectors. The frame-level attention module learns weights that are used to adaptively aggregate the feature vectors into a single discriminative video representation. Finally, a regression module outputs the classification results. Experimental results on the CK+ and AFEW databases show that the proposed method achieves state-of-the-art recognition rates. © 2023 Inst. of Scientific and Technical Information of China. All rights reserved.
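The frame-level aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the attention weights are computed, so everything specific here is an assumption made for illustration: the scoring by dot product with a hypothetical learned vector `q`, the softmax normalization, and the names `softmax` and `aggregate_frames`. It shows only the general idea of turning per-frame feature vectors into one fixed-dimension video representation, not the authors' actual module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_frames(frame_features, q):
    """Fuse per-frame feature vectors into one video-level vector.

    frame_features: list of equal-length feature vectors, one per frame.
    q: hypothetical learned attention vector (an assumption, not from the paper).
    Returns (weights, video_feature).
    """
    # One scalar score per frame: dot product of the frame feature with q.
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, q)) for f in frame_features]
    # Normalize scores so the frame weights are positive and sum to 1.
    weights = softmax(scores)
    # Weighted sum of frame features gives a fixed-dimension representation,
    # regardless of how many frames the sub-video contains.
    dim = len(frame_features[0])
    video_feature = [
        sum(w * f[d] for w, f in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, video_feature


# Toy example: three frames with 2-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
q = [0.5, 0.5]
weights, video = aggregate_frames(frames, q)
```

In this toy input the third frame scores highest against `q`, so it receives the largest weight, and the output vector has the same dimension as a single frame feature no matter how many frames are supplied.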
Pages: 130-139
Number of pages: 9