Gesture Recognition and Multi-modal Fusion on a New Hand Gesture Dataset

Cited by: 1
Authors
Schak, Monika [1 ]
Gepperth, Alexander [1 ]
Affiliations
[1] Fulda Univ Appl Sci, D-36037 Fulda, Germany
Source
PATTERN RECOGNITION APPLICATIONS AND METHODS, ICPRAM 2021, ICPRAM 2022 | 2023 / Vol. 13822
Keywords
Hand gestures; Dataset; Multi-modal data; Data fusion; Sequence classification; Gesture recognition;
DOI
10.1007/978-3-031-24538-1_4
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a baseline for gesture recognition using state-of-the-art sequence classifiers on a new, freely available multi-modal dataset of free-hand gestures. The dataset consists of roughly 100,000 samples, grouped into six classes of typical, easy-to-learn hand gestures. It was recorded with two independent sensors, enabling experiments on multi-modal data fusion at several depths, covering early, intermediate, and late fusion techniques. Since the whole dataset was recorded by a single person, data quality is very high, with little to no risk of incorrectly performed gestures. We report results on unimodal sequence classification using both an LSTM and a CNN classifier. We also show that fusing all four modalities, via late fusion of the output layers of LSTM classifiers each trained on a single modality, results in higher precision. Finally, we demonstrate live gesture classification with an LSTM-based classifier, showing that generalization to other persons performing the gestures is high.
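The late-fusion scheme described in the abstract combines the output layers of independently trained unimodal classifiers. A minimal sketch of this idea, assuming the common approach of averaging per-modality softmax probabilities (the paper's exact fusion rule and classifier outputs are not given here; the logit values below are purely illustrative):

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def late_fusion(per_modality_logits):
    """Average the class-probability outputs of independently trained
    per-modality classifiers and return the fused class prediction."""
    probs = np.stack([softmax(l) for l in per_modality_logits])
    fused = probs.mean(axis=0)  # one probability vector over all classes
    return int(fused.argmax())

# Hypothetical output-layer logits from four unimodal classifiers
# over the six gesture classes in the dataset (values are made up).
logits = [
    np.array([2.0, 0.1, 0.3, 0.2, 0.1, 0.0]),
    np.array([1.5, 0.2, 0.4, 0.1, 0.3, 0.2]),
    np.array([0.3, 0.1, 2.2, 0.2, 0.1, 0.0]),  # this modality disagrees
    np.array([1.8, 0.3, 0.5, 0.2, 0.1, 0.1]),
]
print(late_fusion(logits))
```

Because fusion happens only on the final probability vectors, each modality's classifier can be trained and tuned in isolation, which is what makes late fusion attractive for experiments comparing individual sensors against their combination.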
Pages: 76-97
Page count: 22