One-shot learning hand gesture recognition based on modified 3D convolutional neural networks

Cited by: 2
Authors
Zhi Lu
Shiyin Qin
Xiaojie Li
Lianwei Li
Dinghao Zhang
Affiliations
[1] Beihang University, School of Automation Science and Electrical Engineering
Source
Machine Vision and Applications | 2019, Vol. 30
Keywords
One-shot learning hand gesture recognition; Convolutional neural networks (CNN); Multimodal feature fusion; Continuous fine-tune; Transfer learning;
DOI
Not available
Abstract
Although deep neural networks play a very important role in vision-based hand gesture recognition, it is challenging to acquire the large numbers of annotated samples required for deep training. Furthermore, practical applications often present only a single sample of a new gesture class, so that conventional recognition methods cannot achieve satisfactory classification performance. In this paper, transfer learning is employed to build an effective one-shot learning network architecture that deals with this intractable problem. Useful knowledge gained from deep training on a large dataset of related objects can then be transferred to strengthen one-shot learning hand gesture recognition (OSLHGR), rather than training a network from scratch. Following this idea, a well-designed convolutional network architecture with deeper layers, C3D (Tran et al., in: ICCV, pp 4489–4497, 2015), is modified into an effective tool for extracting spatiotemporal features by deep learning. Continuous fine-tuning is then performed on a single sample of each new class to complete one-shot learning. Moreover, classification is carried out both by a Softmax classifier and by geometric classification based on Euclidean distance. Finally, a series of experiments on two benchmark datasets, VIVA (Vision for Intelligent Vehicles and Applications) and SKIG (Sheffield Kinect Gesture), demonstrates the state-of-the-art recognition accuracy of the proposed method. Meanwhile, a dedicated gesture dataset, BSG, is built with a SoftKinetic DS325 camera to test OSLHGR, and a series of test results verifies its good classification performance and real-time response speed.
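To make the abstract's pipeline concrete, the following is a minimal sketch of the one-shot classification step by Euclidean distance on spatiotemporal features. It assumes PyTorch; the small 3D CNN backbone below is a hypothetical stand-in for the authors' modified C3D network, and the function and variable names are illustrative, not taken from the paper.

```python
# Hedged sketch: one-shot gesture classification via nearest Euclidean distance
# in feature space. The backbone is a stand-in, NOT the authors' modified C3D.
import torch
import torch.nn as nn

class Tiny3DBackbone(nn.Module):
    """Stand-in 3D CNN feature extractor (the paper uses a modified, pretrained C3D)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),          # global pooling over (T, H, W)
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, clips):                 # clips: (N, 3, T, H, W)
        x = self.features(clips).flatten(1)   # (N, 64)
        return self.fc(x)                      # (N, feat_dim)

@torch.no_grad()
def one_shot_classify(backbone, support_clips, support_labels, query_clip):
    """Assign the query clip to the class whose single support example
    (one shot per class) is nearest in Euclidean distance."""
    backbone.eval()
    support_feats = backbone(support_clips)          # (K, D), one row per class
    query_feat = backbone(query_clip.unsqueeze(0))   # (1, D)
    dists = torch.cdist(query_feat, support_feats)   # (1, K) Euclidean distances
    return support_labels[dists.argmin(dim=1).item()]

# Illustrative usage: K new classes, one 16-frame 112x112 clip each.
backbone = Tiny3DBackbone()
support = torch.randn(5, 3, 16, 112, 112)            # one clip per new class
labels = ["swipe", "grab", "pinch", "wave", "point"]
query = torch.randn(3, 16, 112, 112)
print(one_shot_classify(backbone, support, labels, query))
```

In the paper's setting, the backbone would first be pretrained on a large gesture dataset and continuously fine-tuned on the single sample of each new class; the Softmax classifier mentioned in the abstract is an alternative decision rule over the same features.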
Pages: 1157–1180
Number of pages: 23