Multimodal Dynamic Networks for Gesture Recognition

Cited by: 10

Authors:

Wu, Di [1]

Shao, Ling [1]

Affiliation:

[1] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England

Source:

PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14) | 2014

Keywords:

Gesture Recognition; Human-Computer Interaction; Multimodal Fusion; Deep Belief Networks

DOI:

10.1145/2647868.2654969

Chinese Library Classification number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Multimodal input is a real-world situation in gesture recognition applications such as sign language recognition. In this paper, we propose a novel bi-modal (audio and skeleton joints) dynamic network for gesture recognition. First, state-of-the-art dynamic Deep Belief Networks are deployed to extract high level audio and skeletal joints representations. Then, instead of traditional late fusion, we adopt another layer of perceptron for cross modality learning taking the input from each individual net's penultimate layer. Finally, to account for temporal dynamics, the learned shared representations are used for estimating the emission probability to infer action sequences. In particular, we demonstrate that multimodal feature learning will extract semantically meaningful shared representations, outperforming individual modalities, and the early fusion scheme's efficacy against the traditional method of late fusion.
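The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fuse` and `viterbi`, the single softmax layer, the toy weights, and the choice to use the fused posteriors directly as emission probabilities are all assumptions made here for illustration. The two per-modality networks are represented only by their penultimate-layer activations, which are taken as given.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(audio_feat, skeleton_feat, weights, bias):
    # Hypothetical fusion layer: one perceptron layer over the concatenated
    # penultimate activations of the audio and skeleton networks.
    x = audio_feat + skeleton_feat  # list concatenation
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)

def viterbi(emissions, transitions, initial):
    # Most likely state path given per-frame emission probabilities.
    # Assumption: fused posteriors are used directly as emissions.
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(initial)
    score = [log(initial[s]) + log(emissions[0][s]) for s in range(n_states)]
    back = []
    for frame in emissions[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda r: prev[r] + log(transitions[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(transitions[best][s]) + log(frame[s]))
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Toy usage: two hidden states, three frames, left-to-right transitions.
toy_emissions = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
toy_transitions = [[0.7, 0.3], [0.0, 1.0]]
toy_initial = [1.0, 0.0]
toy_path = viterbi(toy_emissions, toy_transitions, toy_initial)
```

In a full system, each row of `toy_emissions` would come from calling `fuse` on one frame's audio and skeleton features, so the decoder operates on the shared representation rather than on either modality alone.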

Pages: 945-948

Page count: 4

50 items in total

[1] Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition
Wu, Di
Pigou, Lionel
Kindermans, Pieter-Jan
Nam Do-Hoang Le
Shao, Ling
Dambre, Joni
Odobez, Jean-Marc
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (08) : 1583 - 1597
[2] Deep Dynamic Neural Networks for Gesture Segmentation and Recognition
Wu, Di
Shao, Ling
COMPUTER VISION - ECCV 2014 WORKSHOPS, PT I, 2015, 8925 : 552 - 571
[3] Dynamic Gesture Recognition Based On Multimodal Fusion Model
Fang, Juan
Xu, Chao
Wang, Chao
Li, Hua
20TH INT CONF ON UBIQUITOUS COMP AND COMMUNICAT (IUCC) / 20TH INT CONF ON COMP AND INFORMATION TECHNOLOGY (CIT) / 4TH INT CONF ON DATA SCIENCE AND COMPUTATIONAL INTELLIGENCE (DSCI) / 11TH INT CONF ON SMART COMPUTING, NETWORKING, AND SERV (SMARTCNS), 2021, : 172 - 177
[4] A Multimodal Dynamic Hand Gesture Recognition Based on Radar-Vision Fusion
Liu, Haoming
Liu, Zhenyu
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
[5] Challenges in multimodal gesture recognition
Escalera, Sergio
Athitsos, Vassilis
Guyon, Isabelle
JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
[6] Multimodal fusion hierarchical self-attention network for dynamic hand gesture recognition
Balaji, Pranav
Prusty, Manas Ranjan
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 98
[7] Seeking a Hierarchical Prototype for Multimodal Gesture Recognition
Li, Yunan
Qi, Tianyu
Ma, Zhuoqi
Quan, Dou
Miao, Qiguang
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 198 - 209
[8] Dynamic Gesture Recognition Based on MEMP Network
Zhang, Xinyu
Li, Xiaoqiang
FUTURE INTERNET, 2019, 11 (04):
[9] Seeking a Hierarchical Prototype for Multimodal Gesture Recognition
Li, Yunan
Qi, Tianyu
Ma, Zhuoqi
Quan, Dou
Miao, Qiguang
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, : 1 - 12
[10] Gesture Recognition in Robotic Surgery With Multimodal Attention
van Amsterdam, Beatrice
Funke, Isabel
Edwards, Eddie
Speidel, Stefanie
Collins, Justin
Sridhar, Ashwin
Kelly, John
Clarkson, Matthew J.
Stoyanov, Danail
IEEE TRANSACTIONS ON MEDICAL IMAGING, 2022, 41 (07) : 1677 - 1687