Multimodal Dynamic Networks for Gesture Recognition

Cited by: 10
Authors
Wu, Di [1]
Shao, Ling [1]
Affiliation
[1] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England
Source
PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14) | 2014
Keywords
Gesture Recognition; Human-Computer Interaction; Multimodal Fusion; Deep Belief Networks;
DOI
10.1145/2647868.2654969
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Multimodal input is common in real-world gesture recognition applications such as sign language recognition. In this paper, we propose a novel bi-modal (audio and skeleton joints) dynamic network for gesture recognition. First, state-of-the-art dynamic Deep Belief Networks are deployed to extract high-level audio and skeletal joint representations. Then, instead of traditional late fusion, we adopt an additional perceptron layer for cross-modality learning, taking its input from each individual net's penultimate layer. Finally, to account for temporal dynamics, the learned shared representations are used to estimate the emission probabilities for inferring action sequences. In particular, we demonstrate that multimodal feature learning extracts semantically meaningful shared representations that outperform individual modalities, and we show the efficacy of the early fusion scheme compared with traditional late fusion.
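The pipeline outlined in the abstract (per-modality features, a shared fusion layer, then sequence inference from emission probabilities) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer sizes, activations, or decoding algorithm, so the tanh hidden layer, the softmax output, the division of posteriors by a state prior (a common hybrid network/HMM convention), the Viterbi decoder, the random weights, and all function and variable names here are assumptions made for illustration only.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse(audio_feat, skel_feat, W_shared, b_shared, W_out, b_out):
    """Concatenate per-frame penultimate-layer features from both
    modalities, apply one shared layer (assumed tanh), and return
    per-frame posteriors over hidden states."""
    joint = np.concatenate([audio_feat, skel_feat], axis=1)
    hidden = np.tanh(joint @ W_shared + b_shared)
    return softmax(hidden @ W_out + b_out)

def viterbi(log_emission, log_trans, log_prior):
    """Most likely state path given log emission scores (T x S),
    log transition matrix (S x S, row = from, column = to),
    and log initial state distribution (S,)."""
    T, S = log_emission.shape
    delta = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    delta[0] = log_prior + log_emission[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # scores[i, j]: i -> j
        back[t] = scores.argmax(axis=0)             # best predecessor of j
        delta[t] = scores.max(axis=0) + log_emission[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Toy data standing in for the two networks' penultimate activations
# (hypothetical dimensions; real features would come from trained nets).
rng = np.random.default_rng(0)
T, d_audio, d_skel, d_hidden, n_states = 6, 4, 5, 8, 3
audio_feat = rng.normal(size=(T, d_audio))
skel_feat = rng.normal(size=(T, d_skel))
W_shared = rng.normal(scale=0.1, size=(d_audio + d_skel, d_hidden))
b_shared = np.zeros(d_hidden)
W_out = rng.normal(scale=0.1, size=(d_hidden, n_states))
b_out = np.zeros(n_states)

posteriors = fuse(audio_feat, skel_feat, W_shared, b_shared, W_out, b_out)

# Assumed convention: emission score is proportional to posterior / prior.
state_prior = np.full(n_states, 1.0 / n_states)
log_emission = np.log(posteriors) - np.log(state_prior)
log_trans = np.log(np.full((n_states, n_states), 1.0 / n_states))

path = viterbi(log_emission, log_trans, np.log(state_prior))
print(path)
```

With untrained random weights the decoded path carries no meaning; the sketch only shows how the fused posteriors would feed the temporal decoding step.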
Pages: 945-948
Page count: 4
Related papers
50 records in total
  • [1] Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition
    Wu, Di
    Pigou, Lionel
    Kindermans, Pieter-Jan
    Nam Do-Hoang Le
    Shao, Ling
    Dambre, Joni
    Odobez, Jean-Marc
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (08) : 1583 - 1597
  • [2] Deep Dynamic Neural Networks for Gesture Segmentation and Recognition
    Wu, Di
    Shao, Ling
    COMPUTER VISION - ECCV 2014 WORKSHOPS, PT I, 2015, 8925 : 552 - 571
  • [3] Dynamic Gesture Recognition Based On Multimodal Fusion Model
    Fang, Juan
    Xu, Chao
    Wang, Chao
    Li, Hua
    20TH INT CONF ON UBIQUITOUS COMP AND COMMUNICAT (IUCC) / 20TH INT CONF ON COMP AND INFORMATION TECHNOLOGY (CIT) / 4TH INT CONF ON DATA SCIENCE AND COMPUTATIONAL INTELLIGENCE (DSCI) / 11TH INT CONF ON SMART COMPUTING, NETWORKING, AND SERV (SMARTCNS), 2021, : 172 - 177
  • [4] A Multimodal Dynamic Hand Gesture Recognition Based on Radar-Vision Fusion
    Liu, Haoming
    Liu, Zhenyu
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [5] Challenges in multimodal gesture recognition
    Escalera, Sergio
    Athitsos, Vassilis
    Guyon, Isabelle
    JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [6] Multimodal fusion hierarchical self-attention network for dynamic hand gesture recognition
    Balaji, Pranav
    Prusty, Manas Ranjan
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 98
  • [7] Seeking a Hierarchical Prototype for Multimodal Gesture Recognition
    Li, Yunan
    Qi, Tianyu
    Ma, Zhuoqi
    Quan, Dou
    Miao, Qiguang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 198 - 209
  • [8] Dynamic Gesture Recognition Based on MEMP Network
    Zhang, Xinyu
    Li, Xiaoqiang
    FUTURE INTERNET, 2019, 11 (04)
  • [9] Seeking a Hierarchical Prototype for Multimodal Gesture Recognition
    Li, Yunan
    Qi, Tianyu
    Ma, Zhuoqi
    Quan, Dou
    Miao, Qiguang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, : 1 - 12
  • [10] Gesture Recognition in Robotic Surgery With Multimodal Attention
    van Amsterdam, Beatrice
    Funke, Isabel
    Edwards, Eddie
    Speidel, Stefanie
    Collins, Justin
    Sridhar, Ashwin
    Kelly, John
    Clarkson, Matthew J.
    Stoyanov, Danail
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2022, 41 (07) : 1677 - 1687