A Novel Facial Expression Recognition (FER) Model Using Multi-scale Attention Network

Citations: 0
Authors
Ghadai, Chakrapani [1]
Patra, Dipti [1]
Okade, Manish [1]
Affiliations
[1] Natl Inst Technol, Rourkela, India
Source
COMPUTER VISION AND IMAGE PROCESSING, CVIP 2023, PT II | 2024 / Vol. 2010
Keywords
Facial expression recognition (FER); Deep learning; Multi-scale; Attention; Receptive field
DOI
10.1007/978-3-031-58174-8_29
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Facial Expression Recognition (FER) faces significant challenges, primarily large variations within classes, subtle visual differences between classes, and limited dataset sizes. Real-world factors such as pose, illumination, and partial occlusion further hinder FER performance. To tackle these challenges, multi-scale and attention-based networks have been widely employed. However, previous approaches have primarily focused on increasing depth while neglecting width, resulting in an inadequate representation of fine-grained facial expression features. This study introduces a novel FER model: a multi-scale attention network (MSA-Net), designed as a wider and deeper network that captures features from various receptive fields through a parallel network structure. Each parallel branch in the proposed network uses channel-complementary multi-scale blocks, e.g., left multi-scale (MS-L) and right multi-scale (MS-R), to broaden the effective receptive field and capture diverse features. Additionally, attention networks are employed to emphasize important regions and boost the discriminative capability of the multi-scale features. The proposed method was evaluated on two popular real-world FER databases: AffectNet and RAF-DB. MSA-Net reduces the impact of pose and partial occlusion, as well as the network's susceptibility to subtle expression-related variations, thereby outperforming other methods in FER.
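The abstract's point that parallel branches see different receptive fields can be illustrated with the standard recurrence for the theoretical receptive field of stacked convolutions. The sketch below is only an illustration: the function name `receptive_field` and the three branch configurations are hypothetical and are not taken from the paper, and it computes the theoretical receptive field, not the effective one the abstract refers to.

```python
def receptive_field(layers):
    """Theoretical receptive field of a stack of convolutions.

    layers: list of (kernel_size, stride, dilation) tuples, ordered
    from the input side to the output side.
    """
    rf = 1    # one input pixel is seen before any layer is applied
    jump = 1  # distance in input pixels between adjacent output positions
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical parallel branches (illustrative values, not from the paper).
branches = {
    "branch_3x3": [(3, 1, 1), (3, 1, 1)],
    "branch_3x3_dilated": [(3, 1, 1), (3, 1, 2)],
    "branch_5x5": [(5, 1, 1), (5, 1, 1)],
}

fields = {name: receptive_field(layers) for name, layers in branches.items()}

for name, rf in fields.items():
    print(f"{name}: {rf}x{rf} input pixels")
```

Under these assumed settings, each branch covers a different input window, which is the property a multi-scale design relies on when it combines branch outputs.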
Pages: 336-346
Page count: 11