Handling Ambiguous Annotations for Facial Expression Recognition in the Wild

Cited by: 0

Authors:

Gera, Darshan [1]

Vikas, G. N. [2]

Balasubramanian, S. [2]

Affiliations:

[1] SSSIHL, Brindavan Campus, Bengaluru, Karnataka, India

[2] SSSIHL, Prasanthi Nilayam Campus Anantpur, Anantapur, Andhra Pradesh, India

Source:

PROCEEDINGS OF THE TWELFTH INDIAN CONFERENCE ON COMPUTER VISION, GRAPHICS AND IMAGE PROCESSING, ICVGIP 2021 | 2021

Keywords:

Ambiguous annotations; Consistency; Strong augmentation; Weak augmentation; Facial Expression Recognition; DEEP

DOI:

10.1145/3490035.3490289

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Annotation ambiguity arising from annotator subjectivity, crowd-sourcing, inter-class similarity, and poor-quality facial expression images has been a key challenge for robust Facial Expression Recognition (FER). Recent deep learning (DL) solutions to this problem select clean samples for training by using two or more networks simultaneously. Based on the observation that wrongly annotated samples, unlike clean samples, yield inconsistent predictions when transformed using different augmentations, we propose a simple and effective single-network FER framework that is robust to noisy annotations. Specifically, we qualify an image as clean (correctly labeled) if the Jensen-Shannon (JS) divergence between its ground-truth distribution and the predicted distribution for its weakly augmented version is smaller than a threshold. The threshold is dynamically tuned. The qualified clean samples provide supervision during training. Further, to learn hard samples (correctly labeled but difficult to classify), we enforce consistency between the predicted distributions of the weakly and strongly augmented versions of every training image through a consistency loss. Comprehensive experiments on FER datasets such as RAFDB, FERPlus, curated FEC, and AffectNet, under both synthetic and real noisy annotation settings, demonstrate the robustness of the proposed method. The source code is publicly available at https://github.com/1980x/HandlingAmbigiousFERAnnotations.
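
The clean-sample criterion and the consistency term described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the one-hot form of the ground-truth distribution, the fixed threshold passed to `is_clean` (the abstract says the threshold is tuned dynamically but does not give the schedule), and the use of JS divergence as the consistency loss are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats. Terms with p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    # The mixture m is positive wherever p or q is positive,
    # so both KL terms below are well defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def one_hot(label, num_classes):
    """Assumed form of the ground-truth distribution for a hard label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def is_clean(label, pred_weak, threshold):
    """Qualify a sample as clean if the JS divergence between its label
    distribution and the prediction on its weakly augmented view is
    below the threshold (hypothetical fixed value; dynamic in the paper)."""
    target = one_hot(label, len(pred_weak))
    return js_divergence(target, pred_weak) < threshold


def consistency_loss(pred_weak, pred_strong):
    """Assumed consistency term: JS divergence between the predictions
    on the weakly and strongly augmented views of the same image."""
    return js_divergence(pred_weak, pred_strong)


if __name__ == "__main__":
    pred_weak = [0.90, 0.05, 0.05]    # softmax output, weak augmentation
    pred_strong = [0.70, 0.20, 0.10]  # softmax output, strong augmentation
    print(is_clean(0, pred_weak, threshold=0.2))  # label agrees with prediction
    print(is_clean(1, pred_weak, threshold=0.2))  # label contradicts prediction
    print(consistency_loss(pred_weak, pred_strong))
```

In a training loop under these assumptions, only samples for which `is_clean` holds would contribute to the supervised loss, while `consistency_loss` would be applied to every training image regardless of its clean or noisy status.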


Pages: 9

