UA-FER: Uncertainty-aware representation learning for facial expression recognition

Cited by: 1
Authors
Zhou, Haoliang [1 ]
Huang, Shucheng [1 ]
Xu, Yuqiao [2 ]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Comp, Zhenjiang 212003, Peoples R China
[2] Tianjin Univ Technol, Sch Comp Sci & Engn, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Facial expression recognition; Uncertainty-aware representation learning; Evidential deep learning; Vision-language pre-training model; Knowledge distillation; FEATURES;
DOI
10.1016/j.neucom.2024.129261
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Facial Expression Recognition (FER) remains a challenging task due to unconstrained conditions like variations in illumination, pose, and occlusion. Current FER approaches mainly focus on learning discriminative features through local attention and global perception of visual encoders, while neglecting the rich semantic information in the text modality. Additionally, these methods rely solely on the softmax-based activation layer for predictions, resulting in overconfident decision-making that hampers the effective handling of uncertain samples and relationships. Such insufficient representations and overconfident predictions degrade recognition performance, particularly in unconstrained scenarios. To tackle these issues, we propose an end-to-end FER framework called UA-FER, which integrates vision-language pre-training (VLP) models with evidential deep learning (EDL) theory to enhance recognition accuracy and robustness. Specifically, to identify multi-grained discriminative regions, we propose the Multi-granularity Feature Decoupling (MFD) module, which decouples global and local facial representations based on image-text affinity while distilling the universal knowledge from the pre-trained VLP models. Additionally, to mitigate misjudgments in uncertain visual-textual relationships, we introduce the Relation Uncertainty Calibration (RUC) module, which corrects these uncertainties using EDL theory. In this way, the model enhances its ability to capture emotion-related discriminative representations and tackle uncertain relationships, thereby improving overall recognition accuracy and robustness. Extensive experiments on in-the-wild and in-the-lab datasets demonstrate that our UA-FER outperforms the state-of-the-art models.
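The evidential deep learning idea the abstract leans on replaces a softmax probability with Dirichlet evidence, so the model can report how uncertain a prediction is rather than always committing overconfidently. A minimal sketch of that standard subjective-logic formulation (not the paper's actual RUC module; the function name and ReLU evidence mapping are illustrative assumptions):

```python
import numpy as np

def edl_uncertainty(logits):
    """Hypothetical EDL head: map logits to Dirichlet evidence.

    Returns per-class belief masses and a scalar uncertainty,
    following the common subjective-logic formulation of
    evidential deep learning (not UA-FER's exact module).
    """
    evidence = np.maximum(logits, 0.0)       # non-negative evidence, e.g. via ReLU
    alpha = evidence + 1.0                   # Dirichlet parameters alpha_k = e_k + 1
    strength = alpha.sum()                   # total Dirichlet strength S
    belief = evidence / strength             # belief mass b_k = e_k / S
    uncertainty = len(alpha) / strength      # u = K / S; beliefs and u sum to 1
    return belief, uncertainty

# Strong evidence for one class gives low uncertainty;
# zero evidence for every class gives maximal uncertainty u = 1.
b, u = edl_uncertainty(np.array([10.0, 0.0, 0.0]))
b0, u0 = edl_uncertainty(np.array([0.0, 0.0, 0.0]))
```

Under this formulation a sample with little total evidence is flagged as uncertain instead of being forced into a confident class, which is the behavior the paper's uncertainty calibration exploits.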
Pages: 13