ASM: Adaptive Sample Mining for In-The-Wild Facial Expression Recognition

Cited by: 0

Authors:

Zhang, Ziyang [1,2]

Sun, Xiao [1,2,3]

An, Liuwei [1,2]

Wang, Meng [1,2,3]

Affiliations:

[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China

[2] Hefei Univ Technol, Anhui Prov Key Lab Affect Comp & Adv Intelligent, Hefei, Peoples R China

[3] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT V | 2024 / Vol. 14429

Keywords:

Facial Expression Recognition; Noisy Label Learning; Adaptive Threshold Mining;

DOI:

10.1007/978-981-99-8469-5_23

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Given the similarity between facial expression categories, the presence of compound facial expressions, and the subjectivity of annotators, facial expression recognition (FER) datasets often suffer from ambiguity and noisy labels. Ambiguous expressions are hard to distinguish from expressions with noisy labels, which hurts the robustness of FER models. Furthermore, recognition difficulty varies across expression categories, so applying one uniform approach to all expressions is unfair. In this paper, we introduce a novel approach called Adaptive Sample Mining (ASM) to dynamically address ambiguity and noise within each expression category. First, the Adaptive Threshold Learning module generates two thresholds, namely the clean and noisy thresholds, for each category. These thresholds are based on the mean class probabilities at each training epoch. Next, the Sample Mining module partitions the dataset into three subsets (clean, ambiguity, and noise) by comparing each sample's confidence with the clean and noisy thresholds. Finally, the Tri-Regularization module employs a mutual learning strategy for the ambiguity subset to enhance discrimination ability, and an unsupervised learning strategy for the noise subset to mitigate the impact of noisy labels. Extensive experiments show that our method effectively mines both ambiguity and noise, and outperforms SOTA methods on both synthetic noisy and original datasets. The supplementary material is available at https://github.com/zzzzzzyang/ASM.
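The thresholding and partitioning steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the threshold formulas, so everything here is an assumption made for illustration: the function names `class_thresholds` and `mine_samples`, the choice of the per-class mean confidence as the clean threshold, and the hypothetical `noisy_ratio` parameter that scales it down to obtain the noisy threshold. This is not the authors' implementation.

```python
from collections import defaultdict


def class_thresholds(probs, labels, noisy_ratio=0.5):
    """Return {class: (clean_threshold, noisy_threshold)}.

    ASSUMPTION: the clean threshold is the mean predicted probability of the
    labeled class over all samples carrying that label, and the noisy
    threshold is noisy_ratio times the clean threshold. The abstract only
    states that both thresholds are based on mean class probabilities.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p, y in zip(probs, labels):
        sums[y] += p[y]
        counts[y] += 1
    thresholds = {}
    for c in counts:
        clean = sums[c] / counts[c]
        thresholds[c] = (clean, noisy_ratio * clean)
    return thresholds


def mine_samples(probs, labels, thresholds):
    """Split sample indices into clean / ambiguity / noise subsets.

    A sample's confidence is the probability assigned to its given label;
    it is compared with the two thresholds of that label's class.
    """
    subsets = {"clean": [], "ambiguity": [], "noise": []}
    for i, (p, y) in enumerate(zip(probs, labels)):
        clean_t, noisy_t = thresholds[y]
        confidence = p[y]
        if confidence >= clean_t:
            subsets["clean"].append(i)
        elif confidence <= noisy_t:
            subsets["noise"].append(i)
        else:
            subsets["ambiguity"].append(i)
    return subsets


# Toy example: five samples, two classes (made-up probabilities).
probs = [[0.9, 0.1], [0.4, 0.6], [0.1, 0.9], [0.3, 0.7], [0.8, 0.2]]
labels = [0, 0, 0, 1, 1]
thresholds = class_thresholds(probs, labels)
print(mine_samples(probs, labels, thresholds))
```

In a training loop, the thresholds would be recomputed at every epoch from the current model outputs, so the partition adapts per class as the model improves. How each subset is then trained (mutual learning for the ambiguity subset, unsupervised learning for the noise subset) is not covered by this sketch.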


Pages: 291-302

Page count: 12

