Handling Ambiguous Annotations for Facial Expression Recognition in the Wild

Cited by: 0
Authors
Gera, Darshan [1]
Vikas, G. N. [2]
Balasubramanian, S. [2]
Affiliations
[1] SSSIHL, Brindavan Campus, Bengaluru, Karnataka, India
[2] SSSIHL, Prasanthi Nilayam Campus Anantpur, Anantapur, Andhra Pradesh, India
Source
PROCEEDINGS OF THE TWELFTH INDIAN CONFERENCE ON COMPUTER VISION, GRAPHICS AND IMAGE PROCESSING, ICVGIP 2021 | 2021
Keywords
Ambiguous annotations; Consistency; Strong augmentation; Weak-augmentation; Facial Expression Recognition; DEEP;
DOI
10.1145/3490035.3490289
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Annotation ambiguity, caused by annotator subjectivity, crowd-sourcing, inter-class similarity and poor-quality facial expression images, has been a key challenge for robust Facial Expression Recognition (FER). Recent deep learning (DL) solutions to this problem select clean samples for training by using two or more networks simultaneously. Based on the observation that wrongly annotated samples have inconsistent predictions compared to clean samples when transformed using different augmentations, we propose a simple and effective single-network FER framework robust to noisy annotations. Specifically, we qualify an image as clean (correctly labeled) if the Jensen-Shannon (JS) divergence between its ground truth distribution and the predicted distribution for its weakly augmented version is smaller than a threshold. The threshold is dynamically tuned. The qualified clean samples provide supervision during training. Further, to learn hard samples (correctly labeled but difficult to classify), we enforce consistency between the predicted distributions of the weakly and strongly augmented versions of every training image through a consistency loss. Comprehensive experiments on FER datasets such as RAFDB, FERPlus, curated FEC and AffectNet, under both synthetic and real noisy annotation settings, demonstrate the robustness of the proposed method. The source code is publicly available at https://github.com/1980x/HandlingAmbigiousFERAnnotations.
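The selection rule in the abstract can be illustrated with a minimal sketch in plain Python. It is not taken from the authors' repository. The function names (`js_divergence`, `is_clean`, `consistency_loss`) and the fixed threshold value are hypothetical. The abstract says the threshold is tuned dynamically but does not say how, and it does not give the form of the consistency loss, so JS divergence is used here as a stand-in.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    # Mixture distribution m = (p + q) / 2.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # KL(a || b); terms with a_i == 0 contribute nothing.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def is_clean(label, pred_weak, num_classes, threshold):
    """Qualify a sample as clean if the JS divergence between its one-hot label
    and the prediction on its weakly augmented view is below the threshold."""
    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]
    return js_divergence(one_hot, pred_weak) < threshold


def consistency_loss(pred_weak, pred_strong):
    """Assumed form of the consistency term: JS divergence between the
    predictions on the weak and strong views (the abstract does not specify it)."""
    return js_divergence(pred_weak, pred_strong)


# Hypothetical softmax output for the weakly augmented view of one image.
pred_weak = [0.90, 0.05, 0.05]
threshold = 0.2  # fixed here for illustration; the paper tunes it dynamically

print(is_clean(0, pred_weak, 3, threshold))  # label agrees with the prediction
print(is_clean(1, pred_weak, 3, threshold))  # label disagrees with the prediction
```

With these illustrative numbers, the sample labeled as class 0 falls under the threshold and the sample labeled as class 1 does not, which is the behaviour the abstract describes for clean versus wrongly annotated images.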
Pages: 9