Context Transformer and Adaptive Method with Visual Transformer for Robust Facial Expression Recognition

Cited by: 7
Authors
Xiong, Lingxin [1]
Zhang, Jicun [2]
Zheng, Xiaojia [1]
Wang, Yuxin [1]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116000, Peoples R China
[2] Neusoft Reach Automot Technol Dalian Co Ltd, Dalian 116000, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 04
Keywords
facial expression recognition; CoT; adaptive method; ViT; complex scenes
DOI
10.3390/app14041535
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
In real-world scenarios, facial expression recognition faces several challenges, including lighting variations, image noise, and face occlusion, which limit the performance of existing models in complex situations. To address these problems, we introduce the CoT module between the CNN and ViT frameworks. By learning the correlations between local area features at a fine-grained level, the module improves the model's ability to perceive subtle differences, helps maintain consistency between the local area features and the global expression, and makes the model more adaptable to complex lighting conditions. We also adopt an adaptive learning method that dynamically adjusts the parameters of the Transformer Encoder's self-attention weight matrix to reduce the interference of noise and occlusion. In experiments, our CoT_AdaViT model reaches the following accuracies on the Oulu-CASIA dataset: NIR 87.94%; VL strong 89.47%, weak 84.76%, dark 82.28%. On the CK+, RAF-DB, and FERPlus datasets it achieves 99.20%, 91.07%, and 90.57%, respectively. These results indicate that the model offers strong recognition accuracy and robustness in complex scenes.
Pages: 16