3D separable convolutional neural network for dynamic hand gesture recognition

Cited by: 53

Authors:

Hu, Zhongxu [1]

Hu, Youmin [1]

Liu, Jie [3]

Wu, Bo [1]

Han, Dongmin [2]

Kurfess, Thomas [2]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan, Hubei, Peoples R China

[2] Georgia Inst Technol, George W Woodruff Sch Mech Engn, Atlanta, GA 30332 USA

[3] Huazhong Univ Sci & Technol, Sch Hydropower & Informat Engn, Wuhan, Hubei, Peoples R China

Source:

NEUROCOMPUTING | 2018 / Vol. 318

Funding:

National Key Research and Development Program;

Keywords:

Hand gesture recognition; 3D separable CNN; Skip connection; Layer-wise learning rate;

DOI:

10.1016/j.neucom.2018.08.042

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Dynamic hand gesture recognition is an essential part of Human-Computer Interaction and an especially important way to realize Augmented Reality; it has attracted attention from many scholars yet still presents many challenges. Recently, aware of the excellent performance of deep convolutional neural networks, many scholars have begun to apply them to gesture recognition and have obtained promising results. However, not enough attention has been paid so far to the number of parameters in the network and the amount of computation it requires. In this paper, a 3D separable convolutional neural network is proposed for dynamic gesture recognition. This study aims to make the model less complex without compromising its high recognition accuracy, so that it can be deployed to augmented reality glasses more easily in the future. By applying skip connections and a layer-wise learning rate, the undesired gradient dispersion caused by the separation operation is resolved and the performance of the network is improved. The fusion of feature information is further promoted by a shuffle operation. In addition, a dynamic hand gesture library is built with HoloLens, which demonstrates the feasibility of the proposed method. (C) 2018 Elsevier B.V. All rights reserved.
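To illustrate why a separable design reduces the parameter count that the abstract is concerned with, the following is a minimal sketch, not the authors' implementation. It assumes a depthwise-plus-pointwise factorization (one k x k x k filter per input channel followed by a 1 x 1 x 1 convolution), which the abstract does not spell out; the function names and the channel and kernel sizes are hypothetical, and bias terms are ignored.

```python
def conv3d_params(c_in, c_out, k):
    """Weight count of a standard 3D convolution with a k x k x k kernel (no bias)."""
    return c_in * c_out * k ** 3


def separable_conv3d_params(c_in, c_out, k):
    """Weight count of an assumed depthwise + pointwise 3D factorization (no bias)."""
    depthwise = c_in * k ** 3   # one k x k x k filter per input channel
    pointwise = c_in * c_out    # 1 x 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 x 3 kernel.
standard = conv3d_params(64, 128, 3)
separable = separable_conv3d_params(64, 128, 3)
reduction = standard / separable

print(standard, separable, round(reduction, 1))
```

Under these assumptions the separable layer needs 9,920 weights against 221,184 for the standard layer, roughly a 22-fold reduction; the actual savings in the paper's network depend on its real layer configuration.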

Citation:

Pages: 151-161

Page count: 11
