MVT-CEAM: a lightweight MobileViT with channel expansion and attention mechanism for facial expression recognition

Cited by: 2
Authors
Wang, Kunxia [1, 2]
Yu, Wancheng [1, 2]
Yamauchi, Takashi [3]
Affiliations
[1] Anhui Jianzhu Univ, Sch Elect & Informat Engn, Hefei 230601, Peoples R China
[2] Anhui Jianzhu Univ, Anhui Int Joint Res Ctr Ancient Architecture Intel, Hefei 230601, Peoples R China
[3] Texas A&M Univ, Dept Psychol & Brain Sci, College Stn, TX 77845 USA
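Keywords
Expression recognition; Transformer; Channel expansion; Attention mechanism; Network
DOI
10.1007/s11760-024-03356-1
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract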
Facial expression recognition is an important research area in psychology with applications in fields such as intelligent healthcare, human-computer interaction, and fuzzy control. However, current deep learning models are often highly complex, computationally expensive, and heavily parameterized, which hinders their deployment on resource-constrained mobile terminals. To address these challenges, this paper proposes MVT-CEAM, an improved lightweight MobileViT with channel expansion and an attention mechanism for facial expression recognition. The model adopts a channel expansion strategy to extract more critical facial expression features from multi-scale feature maps, and introduces a channel attention module to improve feature extraction. Compared with typical lightweight models, the proposed model significantly improves accuracy while remaining lightweight. It achieves 94.35% and 87.41% accuracy on the KDEF and RAF-DB datasets, respectively.
Pages: 6853-6865
Number of pages: 13