Context Transformer and Adaptive Method with Visual Transformer for Robust Facial Expression Recognition

Cited by: 6
Authors
Xiong, Lingxin [1]
Zhang, Jicun [2]
Zheng, Xiaojia [1]
Wang, Yuxin [1]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116000, Peoples R China
[2] Neusoft Reach Automot Technol Dalian Co Ltd, Dalian 116000, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 04
Keywords
facial expression recognition; CoT; adaptive method; ViT; complex scenes;
DOI
10.3390/app14041535
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In real-world scenarios, facial expression recognition faces several challenges, including lighting variations, image noise, and face occlusion, which limit the performance of existing models in complex situations. To address these problems, we introduce the CoT (Context Transformer) module between the CNN and ViT frameworks. By learning the correlations between local area features at a fine-grained level, the module improves the ability to perceive subtle differences, helps maintain consistency between the local area features and the global expression, and makes the model more adaptable to complex lighting conditions. We also adopt an adaptive learning method that reduces the interference of noise and occlusion by dynamically adjusting the parameters of the Transformer Encoder's self-attention weight matrix. In experiments on the Oulu-CASIA dataset, our CoT_AdaViT model reaches accuracies of 87.94% on NIR and, on VL, 89.47% (strong), 84.76% (weak), and 82.28% (dark). On the CK+, RAF-DB, and FERPlus datasets it achieves 99.20%, 91.07%, and 90.57%, respectively. These results indicate that the model is both accurate and robust in complex scenes.
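The abstract does not say how the self-attention weights are adjusted, so the following is only a generic illustration of the idea of down-weighting unreliable patches inside self-attention, not the authors' CoT_AdaViT method. The function name `gated_self_attention`, the per-token `gates` values, and the use of identity projections for queries, keys, and values are all assumptions made for this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_self_attention(tokens, gates):
    """Single-head self-attention with a per-token reliability gate.

    tokens: list of n feature vectors (lists of floats), one per image patch.
    gates:  list of n values in (0, 1]; hypothetical reliability scores,
            where a small value marks a patch as noisy or occluded.
    Queries, keys, and values are the tokens themselves (identity
    projections) to keep the sketch short; a real encoder learns them.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Scaled dot-product logits. Adding log(gate) multiplies the
        # unnormalized attention weight of each key by its gate value.
        logits = [
            scale * sum(q * k for q, k in zip(query, key)) + math.log(gate)
            for key, gate in zip(tokens, gates)
        ]
        row = softmax(logits)
        weights.append(row)
        # Each output is the attention-weighted average of the tokens.
        outputs.append([
            sum(w * value[i] for w, value in zip(row, tokens))
            for i in range(dim)
        ])
    return outputs, weights


# Three toy patch features; the third is treated as occluded in the second call.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
plain_out, plain_w = gated_self_attention(tokens, [1.0, 1.0, 1.0])
gated_out, gated_w = gated_self_attention(tokens, [1.0, 1.0, 0.05])
```

With all gates equal to 1 the function reduces to ordinary scaled dot-product attention; lowering the third gate shrinks the attention every query pays to the third patch while each row of weights still sums to 1.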
Pages: 16