Multimodal fusion hierarchical self-attention network for dynamic hand gesture recognition

Cited by: 8
Authors
Balaji, Pranav [1]
Prusty, Manas Ranjan [2]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci & Engn, Chennai, India
[2] Vellore Inst Technol, Ctr Cyber Phys Syst, Chennai, India
Keywords
Dynamic hand gesture recognition; Multimodal fusion; Cross-attention; Transformer; SHREC'17 track dataset
DOI
10.1016/j.jvcir.2023.104019
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent improvements in dynamic hand gesture recognition have seen a shift from traditional convolutional architectures to attention-based networks. These attention networks have been shown to outperform CNN + LSTM architectures, achieving higher accuracy with fewer model parameters. In particular, skeleton-based attention networks have been shown to outperform visual-based networks owing to the rich information in skeleton-based hand features. However, there is an opportunity to introduce complementary features from other modalities, such as RGB, depth, and optical flow images, to enhance the recognition capability of skeleton-based networks. This paper explores adding a multimodal fusion network to a skeleton-based Hierarchical Self-Attention Network (MF-HAN) and tests whether doing so increases model effectiveness. Unlike traditional fusion techniques, this fusion network uses a cross-attention layer to incorporate features derived from the other modalities in a reduced feature space. The model outperforms its root model and other state-of-the-art models on the SHREC'17 track dataset, especially in the 28-gesture setting, where gesture classification accuracy improves by more than 1%. The model was also evaluated on the DHG dataset.
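The abstract names cross-attention in a reduced feature space as the fusion mechanism but gives no architectural details. The following is a minimal sketch of generic single-head scaled dot-product cross-attention, not the authors' MF-HAN implementation. It assumes that the skeleton stream supplies the queries and a second modality (here called `rgb`) supplies the keys and values; the function names, feature sizes, reduced dimension `d`, and random weights are all hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(skeleton, other, w_q, w_k, w_v):
    """Attend from skeleton frames (queries) to another modality (keys/values).

    Assumption: which modality provides queries is not stated in the abstract.
    """
    q = matmul(skeleton, w_q)  # (frames_s, d): queries in the reduced space
    k = matmul(other, w_k)     # (frames_o, d): keys in the reduced space
    v = matmul(other, w_v)     # (frames_o, d): values in the reduced space
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]       # transpose of k: (d, frames_o)
    scores = matmul(q, k_t)                    # (frames_s, frames_o)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, v)              # (frames_s, d)
    return attended, weights


def rand_matrix(rows, cols):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
skeleton = rand_matrix(8, 16)  # 8 frames of 16-dim skeleton features (assumed sizes)
rgb = rand_matrix(8, 32)       # 8 frames of 32-dim RGB features (assumed sizes)
d = 4                          # reduced feature dimension (assumed)
w_q, w_k, w_v = rand_matrix(16, d), rand_matrix(32, d), rand_matrix(32, d)

fused, weights = cross_attention(skeleton, rgb, w_q, w_k, w_v)
print(len(fused), len(fused[0]))  # one d-dimensional fused vector per skeleton frame
```

Each skeleton frame receives a weighted mix of the other modality's value vectors, so complementary information enters the skeleton stream without concatenating full-size feature maps. In practice a framework layer such as a multi-head attention module would likely replace these hand-written helpers.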
Pages: 11