Real-Time Hand Gesture Recognition Using Fine-Tuned Convolutional Neural Network

Cited by: 69
Authors
Sahoo, Jaya Prakash [1]
Prakash, Allam Jaya [1]
Plawiak, Pawel [2, 3]
Samantray, Saunak [4]
Affiliations
[1] Natl Inst Technol, Dept Elect & Commun Engn, Rourkela 769008, Odisha, India
[2] Cracow Univ Technol, Fac Comp Sci & Telecommun, Dept Comp Sci, Warszawska 24, PL-31155 Krakow, Poland
[3] Polish Acad Sci, Inst Theoret & Appl Informat, Baltycka 5, PL-44100 Gliwice, Poland
[4] IIIT Bhubaneswar, Dept Elect & Tele Commun Engn, Bhubaneswar 751003, Odisha, India
Keywords
ASL; fine-tuning; hand gesture recognition; pre-trained CNN; real-time gesture recognition; score fusion; FUSION
DOI
10.3390/s22030706
Chinese Library Classification number
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
Hand gesture recognition is one of the most effective modes of human-computer interaction because it is highly flexible and user-friendly. A real-time hand gesture recognition system should provide a user-independent interface with high recognition performance. Convolutional neural networks (CNNs) now achieve high recognition rates in image classification problems. Because large labeled sets of static hand gesture images are unavailable, training deep CNNs such as AlexNet, VGG-16 and ResNet from scratch is challenging. Therefore, inspired by CNN performance, an end-to-end fine-tuning method for a pre-trained CNN model with a score-level fusion technique is proposed here to recognize hand gestures in a dataset with a small number of gesture images. The effectiveness of the proposed technique is evaluated using leave-one-subject-out cross-validation (LOO CV) and regular CV tests on two benchmark datasets. A real-time American Sign Language (ASL) recognition system is developed and tested using the proposed technique.
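The two ideas named in the abstract, score-level fusion and leave-one-subject-out evaluation, can be illustrated with a minimal sketch. The abstract does not state the fusion rule, the number of fused models, or any implementation detail, so everything here is an assumption made for illustration: the weighted-sum rule, the equal weight of 0.5, the function names, and the example logits are all hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Convert raw class scores to probabilities, row by row."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def fuse_scores(score_a, score_b, weight=0.5):
    """Weighted-sum fusion of two per-class score matrices.

    The weighted-sum rule is an assumed choice; the abstract only says
    that a score-level fusion technique is used.
    """
    return weight * score_a + (1.0 - weight) * score_b


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx) for each subject."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield subject, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Hypothetical logits from two fine-tuned CNNs: 2 images, 3 gesture classes.
logits_a = np.array([[2.0, 1.0, 0.1], [0.2, 0.3, 2.5]])
logits_b = np.array([[0.5, 2.2, 0.3], [0.1, 0.4, 3.0]])

fused = fuse_scores(softmax(logits_a), softmax(logits_b))
predicted = fused.argmax(axis=1)  # one predicted class index per image

# Hypothetical subject label for each of five images.
splits = list(leave_one_subject_out([0, 0, 1, 2, 2]))
```

In this toy input the first model alone would pick class 0 for the first image, while the fused scores favour class 1, which shows how combining scores can change a decision. Each split from `leave_one_subject_out` holds out every image of one subject, so the test subject is never seen during training; this is what makes the evaluation user-independent.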
Pages: 14