Stacking multiple cues for facial action unit detection

Cited by: 5
Authors
Akay, Simge [1]
Arica, Nafiz [1]
Affiliations
[1] Bahcesehir Univ, Istanbul, Turkey
Keywords
Facial action unit detection; Deep neural network; Stacking classifiers; Facial expression analysis; EXPRESSION; REPRESENTATION; RECOGNITION; FACE
DOI
10.1007/s00371-021-02291-3
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In this study, we develop a deep learning-based stacking scheme to detect facial action units (AUs) in video data. Given a sequence of video frames, the scheme combines multiple cues extracted from AU detectors operating at the frame, segment, and transition levels. The frame-based detector takes a single frame and determines whether an AU is present by using static face features. The segment-based detector examines subsequences of various lengths in the neighborhood of a frame to detect whether that frame belongs to an AU segment. The transition-based detector attempts to find transitions from neutral faces containing no AUs to emotional faces, or vice versa, by analyzing fixed-size subsequences. The frame subsequences in the segment and transition detectors are represented by a motion history image, which models the temporal changes in faces. Each detector employs a separate convolutional neural network, and their results are then fed into a meta-classifier that learns how to combine them. Combining multiple cues at different levels within a framework composed entirely of deep networks improves detection performance by both locating subtle AUs and tracking small changes in facial muscle movements. In the performance analysis, the proposed approach is shown to significantly outperform state-of-the-art methods when compared on the CK+, DISFA, and BP4D databases.
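As an illustration of the motion history image mentioned in the abstract, the following is a minimal sketch of the classic temporal-template update: a pixel is set to the maximum value `tau` when motion is detected and decays by one step otherwise. The record does not give the authors' implementation, so the function name `motion_history_image`, the frame-differencing motion test, and the parameters `tau` and `diff_threshold` are assumptions made for this example only.

```python
def motion_history_image(frames, tau=None, diff_threshold=0.1):
    """Build a motion history image from grayscale frames.

    frames: list of frames, each a list of rows of floats (hypothetical input format).
    tau: number of steps a motion trace persists (assumed default: len(frames) - 1).
    diff_threshold: assumed threshold on the absolute frame difference.
    Returns a 2-D list scaled to [0, 1]; recent motion is brighter.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    if tau is None:
        tau = len(frames) - 1
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > diff_threshold:
                    # Motion detected: reset the pixel to the maximum value.
                    mhi[y][x] = float(tau)
                else:
                    # No motion: let the older trace fade by one step.
                    mhi[y][x] = max(mhi[y][x] - 1.0, 0.0)
    return [[value / tau for value in row] for row in mhi]


# Four 3x3 frames: one pixel changes early, another changes in the last step.
frames = [[[0.0] * 3 for _ in range(3)] for _ in range(4)]
frames[0][0][0] = 1.0  # differs between frame 0 and frame 1 (old motion)
frames[3][1][1] = 1.0  # differs between frame 2 and frame 3 (recent motion)
mhi = motion_history_image(frames)
```

In this toy input the recently changed pixel ends with a higher value than the one that changed earlier, which is how a single image can encode the order of movements within a frame subsequence.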
Pages: 4235-4250
Number of pages: 16