THIN: THrowable Information Networks and Application for Facial Expression Recognition in the Wild

Cited by: 18
Authors
Arnaud, Estephe [1 ]
Dapogny, Arnaud [2 ]
Bailly, Kevin [1 ]
Affiliations
[1] Sorbonne Univ Paris, Inst Syst Intelligents & Robot, CNRS, ISIR, F-75005 Paris, France
[2] Datakalab, F-75017 Paris, France
Keywords
Facial expression recognition; deep ensemble methods; disentangled representations; ADOLESCENTS; ENSEMBLE
DOI
10.1109/TAFFC.2022.3144439
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
For a number of machine learning problems, an exogenous variable can be identified that heavily influences the appearance of the different classes, and an ideal classifier should be invariant to this variable. An example of such an exogenous variable is identity in facial expression recognition (FER). In this paper, we propose a dual exogenous/endogenous representation. The former captures the exogenous variable, whereas the latter models the task at hand (e.g., facial expression). We design a prediction layer that uses a tree-gated deep ensemble conditioned on the exogenous representation. We also propose an exogenous dispelling loss to remove the exogenous information from the endogenous representation. Thus, the exogenous information is used twice in a throwable fashion: first as a conditioning variable for the target task, and second to create invariance within the endogenous representation. We call this method THIN, standing for THrowable Information Networks. We experimentally validate THIN in several contexts where exogenous information can be identified, such as digit recognition under large rotations and shape recognition at multiple scales. We also apply it to FER with identity as the exogenous variable. We demonstrate that THIN significantly outperforms state-of-the-art approaches on several challenging datasets.
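The architecture described in the abstract can be sketched roughly as follows. All module names, layer sizes, and the simplification of tree-gating to a flat softmax gate are assumptions for illustration, not the authors' exact implementation; the dispelling loss would in practice be trained adversarially (e.g., via gradient reversal), which is only noted here in comments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class THINSketch(nn.Module):
    """Rough sketch of the dual-representation idea from the abstract.

    Two encoders produce an exogenous code (e.g., identity) and an
    endogenous code (e.g., expression). The exogenous code gates a small
    ensemble of expert classifiers over the endogenous code; a dispelling
    head tries to predict the exogenous label from the endogenous code,
    and reversing its gradient during training would remove exogenous
    information from that branch.
    """
    def __init__(self, in_dim=128, code_dim=32, n_experts=4,
                 n_classes=7, n_exo_classes=10):
        super().__init__()
        self.exo_enc = nn.Sequential(nn.Linear(in_dim, code_dim), nn.ReLU())
        self.endo_enc = nn.Sequential(nn.Linear(in_dim, code_dim), nn.ReLU())
        # Gating conditioned on the exogenous code (the paper's
        # tree-gating is simplified here to one softmax gate).
        self.gate = nn.Linear(code_dim, n_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(code_dim, n_classes) for _ in range(n_experts)])
        # Dispelling head: predicts the exogenous label from the
        # endogenous code; its loss would be reversed in training.
        self.dispel = nn.Linear(code_dim, n_exo_classes)

    def forward(self, x):
        exo = self.exo_enc(x)
        endo = self.endo_enc(x)
        weights = F.softmax(self.gate(exo), dim=-1)           # (B, E)
        expert_logits = torch.stack(
            [e(endo) for e in self.experts], dim=1)           # (B, E, C)
        logits = (weights.unsqueeze(-1) * expert_logits).sum(dim=1)
        return logits, self.dispel(endo)

model = THINSketch()
x = torch.randn(8, 128)
logits, dispel_logits = model(x)
print(logits.shape, dispel_logits.shape)  # torch.Size([8, 7]) torch.Size([8, 10])
```

Here the exogenous code is used exactly twice, matching the abstract's "throwable" usage: once to weight the experts, and once as the target of the dispelling head; at test time the dispelling head can be discarded.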
Pages: 2336-2348
Page count: 13