Twinned attention network for occlusion-aware facial expression recognition

Cited by: 1
Authors
Devasena, G. [1]
Vidhya, V. [1]
Affiliations
[1] Indian Inst Informat Technol, Dept Comp Sci & Engn, Tiruchirappalli, Tamilnadu, India
Keywords
Facial expression recognition; Occluded images; Attention mechanism; Representation; Features
DOI
10.1007/s00138-024-01641-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Facial expression recognition (FER) is a challenging image-processing task in complex real-world scenarios, where images are captured under varying lighting conditions, facial occlusions, and a wide range of facial orientations. To address this, this paper proposes a novel Twinned attention network (Twinned-Att) for efficient FER in occluded images. The Twinned-Att network consists of two separate modules: a holistic module (HM) and a landmark centric module (LCM). The holistic module comprises a dual coordinate attention block (Dual-CA) and a cross convolution block (Cross-conv). The Dual-CA block learns positional, spatial, and contextual information by highlighting the most prominent characteristics of the facial regions. The Cross-conv block learns spatial inter-dependencies and correlations to identify complex relationships between facial regions. The LCM emphasizes smaller, distinct local regions while remaining resilient to occlusions. Extensive experiments were conducted to evaluate the efficacy of the proposed Twinned-Att. It achieves accuracies of 86.92%, 85.64%, 78.40%, 69.82%, 64.71%, 85.52%, and 85.83% on the RAF-DB, FERPlus, FER2013, FED-RO, SFEW 2.0, occluded RAF-DB, and occluded FERPlus datasets, respectively. The Twinned-Att network was evaluated with several backbone networks, including ResNet-18, ResNet-50, and ResNet-152. It performs consistently well across them, demonstrating its ability to address the challenges of robust FER in images captured in complex real-world environments.
Pages: 18