Seeking a Hierarchical Prototype for Multimodal Gesture Recognition

Cited: 4
Authors
Li, Yunan [1 ,2 ]
Qi, Tianyu [3 ]
Ma, Zhuoqi [3 ]
Quan, Dou [4 ]
Miao, Qiguang [1 ,2 ]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian Key Lab Big Data & Intelligent Vis, Key Lab Smart Human Comp Interact & Wearable Techn, Xian 710071, Peoples R China
[2] Xidian Univ, Key Lab Collaborat Intelligence Syst, Minist Educ, Xian 710071, Peoples R China
[3] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Peoples R China
[4] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Generative adversarial network (GAN); gesture prototype; gesture recognition; memory bank; multimodal; NETWORKS; DATASET; FUSION;
DOI
10.1109/TNNLS.2023.3295811
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Gesture recognition has drawn considerable attention from many researchers owing to its wide range of applications. Although significant progress has been made in this field, previous works have typically focused on distinguishing between different gesture classes while ignoring the intra-class divergence caused by gesture-irrelevant factors. Meanwhile, for multimodal gesture recognition, feature or score fusion in the final stage is the usual choice for combining the information of different modalities. Consequently, the gesture-relevant features in different modalities may be redundant, whereas the complementarity of the modalities is not sufficiently exploited. To handle these problems, we propose a hierarchical gesture prototype framework that highlights gesture-relevant features such as poses and motions. The framework consists of a sample-level prototype and a modal-level prototype. The sample-level gesture prototype is built on a memory bank, which avoids the distraction of gesture-irrelevant factors in each sample, such as illumination, background, and the performers' appearances. The modal-level prototype is then obtained via a generative adversarial network (GAN)-based subnetwork, in which modal-invariant features are extracted and pulled together. Meanwhile, the modal-specific attribute features are used to synthesize the features of other modalities, and this circulation of modality information helps leverage their complementarity. Extensive experiments on three widely used gesture datasets demonstrate that our method effectively highlights gesture-relevant features and outperforms state-of-the-art methods.
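The sample-level prototype described in the abstract can be pictured as a per-class memory bank that smooths out gesture-irrelevant variation (lighting, background, performer appearance) across samples of the same class. The sketch below is illustrative only and is not taken from the paper: the exponential-moving-average update rule, the momentum value, and the cosine-similarity readout are assumptions standing in for the authors' actual memory-bank design.

```python
import numpy as np


class GesturePrototypeBank:
    """Hypothetical per-class memory bank of prototype features.

    Each class keeps one prototype vector that is blended with incoming
    sample features via an exponential moving average (EMA), so
    sample-specific, gesture-irrelevant variation averages out.
    """

    def __init__(self, num_classes: int, feat_dim: int, momentum: float = 0.9):
        self.momentum = momentum
        self.prototypes = np.zeros((num_classes, feat_dim))
        self.initialized = np.zeros(num_classes, dtype=bool)

    def update(self, feature: np.ndarray, label: int) -> None:
        # The first sample of a class seeds its prototype; later samples
        # are blended in with weight (1 - momentum).
        if not self.initialized[label]:
            self.prototypes[label] = feature
            self.initialized[label] = True
        else:
            self.prototypes[label] = (
                self.momentum * self.prototypes[label]
                + (1.0 - self.momentum) * feature
            )

    def classify(self, feature: np.ndarray) -> int:
        # Assign the class of the nearest prototype by cosine similarity.
        sims = self.prototypes @ feature / (
            np.linalg.norm(self.prototypes, axis=1) * np.linalg.norm(feature)
            + 1e-8
        )
        return int(np.argmax(sims))
```

In this reading, classification pulls each sample toward a class-level representative rather than comparing raw samples, which is one plausible way a memory-bank prototype can suppress intra-class divergence.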
Pages: 1-12
Related Papers
50 records in total
  • [21] CGMV-EGR: A multimodal fusion framework for electromyographic gesture recognition. Wang, Weihao; Liu, Yan; Song, Fanghao; Lu, Jingyu; Qu, Jianing; Guo, Junqing; Huang, Jinming. PATTERN RECOGNITION, 2025, 162.
  • [22] Multimodal Gesture Recognition Based on the ResC3D Network. Miao, Qiguang; Li, Yunan; Ouyang, Wanli; Ma, Zhenxin; Xu, Xin; Shi, Weikang; Cao, Xiaochun. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017: 3047-3055.
  • [23] Prototype Gesture Recognition Interface for Vehicular Head-Up Display System. Lagoo, Ramesh; Charissis, Vassilis; Chan, Warren; Khan, Soheeb; Harrison, David. 2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2018.
  • [24] Dual-Modal Gesture Recognition Using Adaptive Weight Hierarchical Soft Voting Mechanism. Zhang, Yue; Wei, Sheng; Wang, Zheng; Liu, Honghai. IEEE TRANSACTIONS ON CYBERNETICS, 2025.
  • [25] Continuous hand gesture recognition in the learned hierarchical latent variable space. Han, Lei; Liang, Wei. ARTICULATED MOTION AND DEFORMABLE OBJECTS, PROCEEDINGS, 2008, 5098: 32-41.
  • [26] Hierarchical hand gesture recognition model for virtual reality applications in medicine. Silva, AM; Moreno, J; León-Rojas, JM; Andrés, F. METMBS'00: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MATHEMATICS AND ENGINEERING TECHNIQUES IN MEDICINE AND BIOLOGICAL SCIENCES, VOLS I AND II, 2000: 535-540.
  • [27] ChAirGest - A Challenge for Multimodal Mid-Air Gesture Recognition for Close HCI. Ruffieux, Simon; Lalanne, Denis; Mugellini, Elena. ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013: 483-488.
  • [28] A Multimodal Dynamic Hand Gesture Recognition Based on Radar-Vision Fusion. Liu, Haoming; Liu, Zhenyu. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72.
  • [29] A Multimodal Multilevel Converged Attention Network for Hand Gesture Recognition With Hybrid sEMG and A-Mode Ultrasound Sensing. Wei, Sheng; Zhang, Yue; Liu, Honghai. IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (12): 7723-7734.
  • [30] Hierarchical Attention-Based Astronaut Gesture Recognition: A Dataset and CNN Model. Gu, Lingyun; Zhang, Lin; Wang, Zhaokui. IEEE ACCESS, 2020, 8: 68787-68798.