TriCAFFNet: A Tri-Cross-Attention Transformer with a Multi-Feature Fusion Network for Facial Expression Recognition

Cited by: 1

Authors:

Tian, Yuan [1]

Wang, Zhao [1]

Chen, Di [1]

Yao, Huang [1]

Affiliations:

[1] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China

Source:

SENSORS | 2024 / Volume 24 / Issue 16

Keywords:

facial expression recognition; vision transformer; multi-feature; tri-cross attention; CLASSIFICATION; SCALE;

DOI:

10.3390/s24165391

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Subject classification codes:

070302 ; 081704 ;

Abstract:

In recent years, significant progress has been made in facial expression recognition methods. However, facial expression recognition in real-world environments still requires further research. This paper proposes a tri-cross-attention transformer with a multi-feature fusion network (TriCAFFNet) to improve facial expression recognition performance under challenging conditions. By combining LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, landmark features, and CNN (convolutional neural network) features from facial images, the model is provided with a rich input that improves its ability to discern subtle differences between images. Additionally, tri-cross-attention blocks are designed to facilitate information exchange between the different features, so that the features guide one another in capturing salient attention. Extensive experiments on several widely used datasets show that TriCAFFNet achieves state-of-the-art (SOTA) performance: 92.17% on RAF-DB, 67.40% on AffectNet (7 classes), and 63.49% on AffectNet (8 classes).
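The abstract does not specify how the tri-cross-attention blocks are built, so the following is only a minimal sketch of generic scaled dot-product cross-attention, the standard mechanism by which one feature stream can attend to another. It is not the authors' block. The function names, token counts, feature dimension, and the omission of learned projection matrices are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Generic cross-attention: tokens of one stream attend to another stream.

    query_feats:   (n_q, d) tokens that ask for information
    context_feats: (n_c, d) tokens that provide keys and values
    Learned Q/K/V projections are omitted for brevity (an assumption).
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)  # (n_q, n_c)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    return weights @ context_feats, weights              # (n_q, d), (n_q, n_c)

# Hypothetical token sequences standing in for two of the feature streams;
# the shapes are arbitrary and not taken from the paper.
rng = np.random.default_rng(0)
cnn_tokens = rng.normal(size=(49, 64))
landmark_tokens = rng.normal(size=(68, 64))

# CNN tokens are "guided" by the landmark tokens.
guided, weights = cross_attention(cnn_tokens, landmark_tokens)
```

Running the same function with the roles of the streams swapped would give guidance in the other direction; how the paper actually wires its streams together is not stated in this record.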

Pages: 16
