Visual Scene-aware Hybrid Neural Network Architecture for Video-based Facial Expression Recognition

Cited by: 6

Authors:

Lee, Min Kyu [1]

Choi, Dong Yoon [1]

Kim, Dae Ha [1]

Song, Byung Cheol [1]

Affiliations:

[1] Inha Univ, Dept Elect Engn, Incheon, South Korea

Source:

2019 14TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2019) | 2019

Keywords:

DOI:

10.1109/fg.2019.8756551

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

With the rapid development of deep learning, facial expression recognition (FER) technology has made considerable progress recently. However, since conventional FER techniques are mainly designed and trained for videos artificially acquired in a limited environment, they may not operate robustly on videos acquired in a wild environment. To solve this problem, this paper proposes a scene-aware hybrid neural network (NN) with a novel combination of a three-dimensional (3D) convolutional NN (CNN), a 2D CNN, and a recurrent NN (RNN). The characteristics of the proposed network are as follows. First, we extract video-based global features and frame-based local features at the same time. In detail, the latent features containing the overall visual scene of a given video are extracted by a 3D CNN with an auxiliary classifier, and a fine-tuned 2D CNN is adopted to extract latent features containing small details from each frame. Second, the RNN not only performs temporal-domain learning but also fuses, feature-wise, the two latent features extracted from these networks. For effective fusion, we also present three RNN schemes. Third, the proposed network, in which the above-mentioned methods collaborate, works very robustly in a wild environment as well as in a limited environment. Extensive experiments show that the proposed network provides an average accuracy of 49.9% on the AFEW dataset, i.e., a representative wild dataset, and an accuracy of 98.2% on the CK+ dataset. We also show that the proposed network outperforms state-of-the-art networks.
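The fusion step described in the abstract can be illustrated with a rough sketch. It is not the authors' implementation. The abstract does not specify the three RNN schemes, so this example assumes the simplest option: the video-level (global) feature is concatenated with each frame-level (local) feature and fed to a plain tanh RNN. The function names, feature sizes, the random weights standing in for learned parameters, and the choice of 7 output classes are all hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (a stand-in for learned parameters)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def fuse_and_classify(global_feat, frame_feats, hidden_size, num_classes, seed=0):
    """Fuse one global feature with per-frame local features via a tanh RNN.

    Assumption: fusion is done by concatenation at every time step. The
    paper's actual fusion schemes are not given in the abstract.
    """
    rng = random.Random(seed)
    in_size = len(global_feat) + len(frame_feats[0])
    w_in = make_matrix(hidden_size, in_size, rng)
    w_rec = make_matrix(hidden_size, hidden_size, rng)
    w_out = make_matrix(num_classes, hidden_size, rng)

    hidden = [0.0] * hidden_size
    for local_feat in frame_feats:
        # Same global (scene) feature at every step, new local feature per frame.
        fused = list(global_feat) + list(local_feat)
        from_input = matvec(w_in, fused)
        from_state = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]

    # Softmax over the logits computed from the final hidden state.
    logits = matvec(w_out, hidden)
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes: a 4-dim global feature (as if from a 3D CNN) and
# 5 frames of 3-dim local features (as if from a 2D CNN).
data_rng = random.Random(1)
global_feat = [data_rng.random() for _ in range(4)]
frame_feats = [[data_rng.random() for _ in range(3)] for _ in range(5)]
probs = fuse_and_classify(global_feat, frame_feats, hidden_size=8, num_classes=7)
```

In a real system the two feature vectors would come from trained 3D and 2D CNNs, and the recurrent weights would be learned jointly with the classifier rather than drawn at random.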

Citation

Pages: 153-160

Page count: 8
