Facial Expression Recognition in the Wild: A Cycle-Consistent Adversarial Attention Transfer Approach

Cited by: 14
Authors
Zhang, Feifei [1 ,2 ]
Zhang, Tianzhu [2 ,3 ]
Mao, Qirong [1 ]
Duan, Lingyu [4 ]
Xu, Changsheng [2 ,3 ]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang, Jiangsu, Peoples R China
[2] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Peking Univ, Inst Digital Media, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18) | 2018
Funding
Beijing Natural Science Foundation;
Keywords
facial expression recognition; domain adaptation; attention transfer; generative adversarial networks; emotional cue extraction; GAUSSIAN-PROCESSES; MULTIVIEW; POSE;
DOI
10.1145/3240508.3240574
CLC Number
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Facial expression recognition (FER) is a challenging problem due to the wide variation of expressions under arbitrary poses. Most conventional approaches perform FER in laboratory-controlled environments. Different from existing methods, in this paper, we formulate FER in the wild as a domain adaptation problem and propose a novel auxiliary domain guided Cycle-consistent adversarial Attention Transfer model (CycleAT) for simultaneous facial image synthesis and facial expression recognition in the wild. The proposed model utilizes large-scale unlabeled web facial images as an auxiliary domain to reduce the gap between the source domain and the target domain, based on generative adversarial networks (GANs) embedded with an effective attention transfer module, which enjoys several merits. First, the GAN-based method can automatically generate labeled facial images in the wild by harnessing information from labeled facial images in the source domain and unlabeled web facial images in the auxiliary domain. Second, the class-discriminative spatial attention maps from the classifier in the source domain are leveraged to boost the performance of the classifier in the target domain. Third, it effectively preserves the structural consistency of local pixels and global attributes in the synthesized facial images through pixel cycle-consistency and discriminative losses. Quantitative and qualitative evaluations on two challenging in-the-wild datasets demonstrate that the proposed model performs favorably against state-of-the-art methods.
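The abstract names two loss ingredients: a pixel cycle-consistency loss on the synthesized images and an attention transfer loss that aligns the target classifier's class-discriminative spatial attention maps with those of the source classifier. The sketch below (PyTorch) illustrates one plausible form of these two terms; the generator/classifier names and the way attention maps are computed are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only, assuming hypothetical generators G_s2t / G_t2s
# and conv feature tensors from the source/target classifiers.
import torch.nn.functional as F

def cycle_consistency_loss(G_s2t, G_t2s, x_src):
    """Pixel cycle-consistency: source -> wild style -> back to source."""
    x_fake_tgt = G_s2t(x_src)          # translate a labeled source face into the wild domain
    x_recon = G_t2s(x_fake_tgt)        # translate it back to the source domain
    return F.l1_loss(x_recon, x_src)   # L1 penalty preserves local pixel structure

def attention_map(feature):
    """Spatial attention map from a conv feature tensor of shape (B, C, H, W):
    channel-wise energy, flattened and L2-normalized per image."""
    att = feature.pow(2).mean(dim=1)   # (B, H, W)
    att = att.flatten(1)               # (B, H*W)
    return F.normalize(att, p=2, dim=1)

def attention_transfer_loss(feat_src, feat_tgt):
    """Encourage the target-domain classifier to attend to the same
    class-discriminative regions as the source-domain classifier."""
    return F.mse_loss(attention_map(feat_tgt), attention_map(feat_src))
```

In practice these terms would be weighted and combined with the adversarial and classification losses; the weights and network architectures are not specified in the abstract.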
Pages: 126-135
Page count: 10