A Novel Facial Expression Recognition (FER) Model Using Multi-scale Attention Network

Citations: 0

Authors:

Ghadai, Chakrapani [1]

Patra, Dipti [1]

Okade, Manish [1]

Affiliations:

[1] National Institute of Technology, Rourkela, India

Source:

COMPUTER VISION AND IMAGE PROCESSING, CVIP 2023, PT II | 2024 / Vol. 2010

Keywords:

Facial expression recognition (FER); Deep learning; Multi-scale; Attention; Receptive field

DOI:

10.1007/978-3-031-58174-8_29

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Facial Expression Recognition (FER) faces significant challenges: large variations within classes, subtle visual differences between classes, and limited dataset sizes. Real-world factors such as pose, illumination, and partial occlusion further hinder FER performance. To tackle these challenges, multi-scale and attention-based networks have been widely employed. However, previous approaches have primarily focused on increasing depth while neglecting width, resulting in an inadequate representation of fine-grained facial expression features. This study introduces a novel FER model. A multi-scale attention network (MSA-Net) is designed as a wider and deeper network that captures features from various receptive fields through a parallel network structure. Each parallel branch in the proposed network uses channel-complementary multi-scale blocks, e.g., left multi-scale (MS-L) and right multi-scale (MS-R), to broaden the effective receptive field and capture diverse features. Additionally, attention networks are employed to emphasize important regions and boost the discriminative capability of the multi-scale features. The proposed method was evaluated on two popular real-world FER databases: AffectNet and RAF-DB. MSA-Net reduces the impact of pose and partial occlusion, as well as the network's susceptibility to subtle expression-related variations, thereby outperforming other methods in FER.
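
The abstract's point about parallel branches seeing "various receptive fields" can be illustrated with the standard receptive-field recurrence for stacked convolutions. The following is a minimal sketch only: the function name `receptive_field` and the branch configurations are hypothetical, chosen for illustration, and are not taken from the paper or from the MS-L/MS-R blocks it describes.

```python
from typing import Dict, Iterable, List, Tuple


def receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field (in input pixels, along one axis) of a stack of
    convolution layers given as (kernel_size, stride) pairs."""
    rf = 1    # before any layer, one unit covers exactly one input pixel
    jump = 1  # distance in input pixels between adjacent units at this depth
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) jumps
        jump *= stride             # striding spreads later units further apart
    return rf


# Hypothetical parallel branches; kernel sizes and strides are illustrative
# assumptions, not the configuration used by MSA-Net.
branches: Dict[str, List[Tuple[int, int]]] = {
    "branch_3x3": [(3, 1), (3, 1)],
    "branch_5x5": [(5, 1), (5, 1)],
    "branch_3x3_strided": [(3, 2), (3, 1)],
}

fields = {name: receptive_field(layers) for name, layers in branches.items()}

for name, rf in fields.items():
    print(f"{name}: receptive field = {rf} pixels")
```

Under these assumed settings the three branches cover 5, 9, and 7 input pixels respectively, so running them in parallel and combining their outputs exposes the later layers to several spatial scales at once.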


Pages: 336-346

Page count: 11

