A facial depression recognition method based on hybrid multi-head cross attention network

Cited by: 4
Authors
Li, Yutong [1]
Liu, Zhenyu [1]
Zhou, Li [1]
Yuan, Xiaoyan [1]
Shangguan, Zixuan [1]
Hu, Xiping [1]
Hu, Bin [1]
Affiliations
[1] Lanzhou Univ, Gansu Prov Key Lab Wearable Comp, Lanzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
facial depression recognition; convolutional neural networks; attention mechanism; automatic depression estimation; end-to-end network; TEXTURE CLASSIFICATION; APPEARANCE;
DOI
10.3389/fnins.2023.1188434
Chinese Library Classification number
Q189 [Neuroscience]
Subject classification code
071006
Abstract
Introduction: Deep learning methods based on convolutional neural networks (CNNs) have demonstrated impressive performance in depression analysis. Nevertheless, two critical challenges remain: (1) Because of their spatial locality, CNNs still struggle to learn long-range inductive biases during low-level feature extraction across different facial regions. (2) A model with only a single attention head has difficulty attending to several parts of the face simultaneously, which makes it less sensitive to other important facial regions associated with depression. In facial depression recognition, many of the cues come from several areas of the face at once, e.g., the mouth and eyes.
Methods: To address these issues, we present an end-to-end integrated framework called Hybrid Multi-head Cross Attention Network (HMHN), which comprises two stages. The first stage consists of the Grid-Wise Attention block (GWA) and the Deep Feature Fusion block (DFF), which learn low-level visual depression features. In the second stage, we obtain the global representation by encoding high-order interactions among local features with the Multi-head Cross Attention block (MAB) and the Attention Fusion block (AFB).
Results: We ran experiments on the AVEC2013 and AVEC2014 depression datasets. The results on AVEC2013 (RMSE = 7.38, MAE = 6.05) and AVEC2014 (RMSE = 7.60, MAE = 6.01) demonstrate the efficacy of our method, which outperformed most state-of-the-art video-based depression recognition approaches.
Discussion: We proposed a deep learning hybrid model for depression recognition that captures the higher-order interactions between the depression features of multiple facial regions. The model can effectively reduce the error in depression recognition and shows great potential for clinical experiments.
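The abstract reports results as RMSE and MAE. As a reading aid, the sketch below shows how these two error metrics are conventionally computed for predicted versus reference depression scores. The function names and the sample scores are hypothetical illustrations; they are not taken from the paper or from the AVEC datasets.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical reference and predicted scores, for illustration only.
true_scores = [10, 22, 5, 31]
pred_scores = [14, 19, 9, 26]

print("RMSE:", rmse(true_scores, pred_scores))
print("MAE:", mae(true_scores, pred_scores))
```

RMSE penalizes large individual errors more heavily than MAE, so it is never smaller than MAE on the same predictions. This is consistent with the reported figures, where RMSE exceeds MAE on both datasets.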
Pages: 13