ASM: Adaptive Sample Mining for In-The-Wild Facial Expression Recognition

被引:0
作者
Zhang, Ziyang [1 ,2 ]
Sun, Xiao [1 ,2 ,3 ]
An, Liuwei [1 ,2 ]
Wang, Meng [1 ,2 ,3 ]
机构
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Hefei Univ Technol, Anhui Prov Key Lab Affect Comp & Adv Intelligent, Hefei, Peoples R China
[3] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China
来源
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT V | 2024年 / 14429卷
关键词
Facial Expression Recognition; Noisy Label Learning; Adaptive Threshold Mining;
D O I
10.1007/978-981-99-8469-5_23
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Given the similarity between facial expression categories, the presence of compound facial expressions, and the subjectivity of annotators, facial expression recognition (FER) datasets often suffer from ambiguity and noisy labels. Ambiguous expressions are challenging to differentiate from expressions with noisy labels, which hurt the robustness of FER models. Furthermore, the difficulty of recognition varies across different expression categories, rendering a uniform approach unfair for all expressions. In this paper, we introduce a novel approach called Adaptive Sample Mining (ASM) to dynamically address ambiguity and noise within each expression category. First, the Adaptive Threshold Learning module generates two thresholds, namely the clean and noisy thresholds, for each category. These thresholds are based on the mean class probabilities at each training epoch. Next, the Sample Mining module partitions the dataset into three subsets: clean, ambiguity, and noise, by comparing the sample confidence with the clean and noisy thresholds. Finally, the Tri-Regularization module employs a mutual learning strategy for the ambiguity subset to enhance discrimination ability, and an unsupervised learning strategy for the noise subset to mitigate the impact of noisy labels. Extensive experiments prove that our method can effectively mine both ambiguity and noise, and outperform SOTA methods on both synthetic noisy and original datasets. The supplement material is available at https://github.com/zzzzzzyang/ASM.
引用
收藏
页码:291 / 302
页数:12
相关论文
共 29 条
  • [1] Arpit D, 2017, PR MACH LEARN RES, V70
  • [2] Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution
    Barsoum, Emad
    Zhang, Cha
    Ferrer, Cristian Canton
    Zhang, Zhengyou
    [J]. ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2016, : 279 - 283
  • [3] Blum A., 1998, Proceedings of the Eleventh Annual Conference on Computational Learning Theory, P92, DOI 10.1145/279943.279962
  • [4] Facial Expression Recognition in the Wild via Deep Attentive Center Loss
    Farzaneh, Amir Hossein
    Qi, Xiaojun
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 2401 - 2410
  • [5] Goodfellow Ian J., 2013, Neural Information Processing. 20th International Conference, ICONIP 2013. Proceedings: LNCS 8228, P117, DOI 10.1007/978-3-642-42051-1_16
  • [6] Guo LZ, 2022, PR MACH LEARN RES
  • [7] Guo Lan-Zhe, 2022, Advances in Neural Information Processing Systems
  • [8] MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition
    Guo, Yandong
    Zhang, Lei
    Hu, Yuxiao
    He, Xiaodong
    Gao, Jianfeng
    [J]. COMPUTER VISION - ECCV 2016, PT III, 2016, 9907 : 87 - 102
  • [9] Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels
    Han, Bo
    Yao, Quanming
    Yu, Xingrui
    Niu, Gang
    Xu, Miao
    Hu, Weihua
    Tsang, Ivor W.
    Sugiyama, Masashi
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [10] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778