ViTFER: Facial Emotion Recognition with Vision Transformers

Cited by: 44
Authors
Chaudhari, Aayushi [1]
Bhatt, Chintan [2]
Krishna, Achyut [1]
Mazzeo, Pier Luigi [3]
Affiliations
[1] Charotar Univ Sci & Technol CHARUSAT, Chandubhai S Patel Inst Technol CSPIT, U&P U Patel Dept Comp Engn, CHARUSAT Campus, Changa 388421, India
[2] Pandit Deendayal Energy Univ, Sch Technol, Dept Comp Sci & Engn, Gandhinagar 382007, India
[3] Inst Appl Sci & Intelligent Syst, Natl Res Council Italy, I-73100 Lecce, Italy
Keywords
computer vision; emotion recognition; ResNet; transformers; Vision Transformers
DOI
10.3390/asi5040080
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Automated emotion recognition has proven to be a powerful tool in several fields. The main objective of facial emotion recognition (FER) is to map facial expressions to their respective emotional states. In this study, facial expressions were classified using the ResNet-18 model and transformers. The study examines the performance of the Vision Transformer on this task and compares our model with state-of-the-art models on hybrid datasets. It describes the pipeline and associated procedures for face detection, cropping, and feature extraction using a recent deep learning model, the fine-tuned transformer. The experimental findings demonstrate that the proposed emotion recognition system can be used successfully in practical settings.
Pages: 16