Multimodal fusion hierarchical self-attention network for dynamic hand gesture recognition

Cited by: 8
Authors
Balaji, Pranav [1]
Prusty, Manas Ranjan [2]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci & Engn, Chennai, India
[2] Vellore Inst Technol, Ctr Cyber Phys Syst, Chennai, India
Keywords
Dynamic hand gesture recognition; Multimodal fusion; Cross-attention; Transformer; SHREC'17 track dataset
DOI
10.1016/j.jvcir.2023.104019
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Recent improvements in dynamic hand gesture recognition have seen a shift from traditional convolutional architectures to attention-based networks. These attention networks have been shown to outperform CNN + LSTM architectures, achieving higher accuracy with fewer model parameters. In particular, skeleton-based attention networks have been shown to outperform visual-based networks because of the rich information in skeleton-based hand features. However, there is an opportunity to introduce complementary features from other modalities, such as RGB, depth, and optical flow images, to enhance the recognition capability of skeleton-based networks. This paper explores the addition of a multimodal fusion network to a skeleton-based Hierarchical Self-Attention Network (MF-HAN) and tests whether it increases model effectiveness. Unlike traditional fusion techniques, this fusion network uses features derived from other sources of multimodal data in a reduced feature space using a cross-attention layer. The model outperforms its root model and other state-of-the-art models on the SHREC'17 track dataset, especially in the 28-gesture setting, where gesture classification accuracy improves by more than 1%. The model was also evaluated on the DHG dataset.
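The abstract does not give the details of the fusion layer, so the following is only a generic sketch of cross-attention fusion in a reduced feature space, not the authors' implementation. It assumes that per-frame skeleton features act as queries, that features from another modality (for example depth) act as keys and values, and that the attended context is concatenated to the skeleton features. The function name `cross_attention_fusion`, the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_fusion(skel, other, w_q, w_k, w_v):
    """Fuse skeleton features with another modality via cross-attention.

    skel:  (T, d_s)   skeleton features, one row per frame (assumed layout)
    other: (T_o, d_o) features from another modality
    w_q:   (d_s, d_r) query projection into the reduced space
    w_k:   (d_o, d_r) key projection into the reduced space
    w_v:   (d_o, d_r) value projection into the reduced space
    Returns the fused features (T, d_s + d_r) and the attention map (T, T_o).
    """
    q = skel @ w_q                      # (T, d_r)
    k = other @ w_k                     # (T_o, d_r)
    v = other @ w_v                     # (T_o, d_r)
    d_r = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_r)     # scaled dot-product scores, (T, T_o)
    attn = softmax(scores, axis=-1)     # each row sums to 1
    context = attn @ v                  # (T, d_r)
    # Assumption: fusion by concatenation; other merge rules are possible.
    fused = np.concatenate([skel, context], axis=-1)
    return fused, attn


# Illustrative shapes only: 8 skeleton frames, 5 frames of another modality.
rng = np.random.default_rng(0)
skel = rng.normal(size=(8, 16))
other = rng.normal(size=(5, 32))
w_q = rng.normal(size=(16, 4))
w_k = rng.normal(size=(32, 4))
w_v = rng.normal(size=(32, 4))

fused, attn = cross_attention_fusion(skel, other, w_q, w_k, w_v)
print(fused.shape, attn.shape)  # (8, 20) (8, 5)
```

Projecting both modalities to a small dimension `d_r` before attention keeps the added parameter count low, which is one plausible reading of "reduced feature space" in the abstract.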
Pages: 11
Related papers
50 in total
  • [21] Attention-guided Multi-step Fusion: A Hierarchical Fusion Network for Multimodal Recommendation
    Zhou, Yan
    Guo, Jie
    Sun, Hao
    Song, Bin
    Yu, Fei Richard
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 1816 - 1820
  • [22] Hand gesture recognition based on attentive feature fusion
    Yu, Bin
    Luo, Zhiming
    Wu, Huangbin
    Li, Shaozi
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2020, 32 (22)
  • [23] Comparison of Algorithms for Dynamic Hand Gesture Recognition
    Kajan, Slavomir
    Goga, Jozef
    Zsiros, Ondrej
    PROCEEDINGS OF THE 2020 30TH INTERNATIONAL CONFERENCE CYBERNETICS & INFORMATICS (K&I '20), 2020,
  • [24] Masked face recognition based on knowledge distillation and convolutional self-attention network
    Wan, Weiguo
    Wen, Runlin
    Yao, Li
    Yang, Yong
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, : 2269 - 2284
  • [25] RPFNET: COMPLEMENTARY FEATURE FUSION FOR HAND GESTURE RECOGNITION
    Kim, Do Yeon
    Kim, Dae Ha
    Song, Byung Cheol
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 986 - 990
  • [26] Spatio-Temporal 3D Action Recognition with Hierarchical Self-Attention Mechanism
    Araei, Soheil
    Nadian-Ghomsheh, Ali
    2021 26TH INTERNATIONAL COMPUTER CONFERENCE, COMPUTER SOCIETY OF IRAN (CSICC), 2021,
  • [27] Attentional control and the self: The Self-Attention Network (SAN)
    Humphreys, Glyn W.
    Sui, Jie
    COGNITIVE NEUROSCIENCE, 2016, 7 (1-4) : 5 - 17
  • [28] DyHand: dynamic hand gesture recognition using BiLSTM and soft attention methods
    Singh, Rohit Pratap
    Singh, Laiphrakpam Dolendro
    The Visual Computer, 2025, 41 (1) : 41 - 51
  • [29] SIFNet: A self-attention interaction fusion network for multisource satellite imagery template matching
    Liu, Ming
    Zhou, Gaoxiang
    Ma, Lingfei
    Li, Liangzhi
    Mei, Qiong
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2023, 118
  • [30] Dynamic Hand Gesture Recognition With Leap Motion Controller
    Lu, Wei
    Tong, Zheng
    Chu, Jinghui
    IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (09) : 1188 - 1192