A Hybrid Approach to Hand Detection and Type Classification in Upper-Body Videos

Cited by: 0
Authors
Papadimitriou, Katerina [1]
Potamianos, Gerasimos [1, 2]
Affiliations
[1] Univ Thessaly, Elect & Comp Engn Dept, Volos 38221, Greece
[2] Athena Res & Innovat Ctr, Maroussi 15125, Greece
Source
PROCEEDINGS OF THE 2018 7TH EUROPEAN WORKSHOP ON VISUAL INFORMATION PROCESSING (EUVIP) | 2018
Funding
European Union Horizon 2020
Keywords
Hand detection; hand type classification; region-based convolutional neural network (R-CNN); AdaBoost face detection; Kalman filtering; ALGORITHM
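DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract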
Detection of hands in videos and their classification into left and right types are crucial in various human-computer interaction and data mining systems. A variety of effective deep learning methods have been proposed for this task, such as region-based convolutional neural networks (R-CNNs); however, the large number of proposal windows they evaluate per frame makes them computationally intensive. To address this, we propose a hybrid approach that replaces the "selective search" R-CNN module with an image processing pipeline that assumes the facial region is visible, as for example in signing and cued speech videos. Our system comprises two main phases: preprocessing and classification. In the preprocessing phase, we incorporate facial information, obtained by an AdaBoost face detector, into a skin-tone-based segmentation scheme that drives Kalman-filtering-based hand tracking, generating very few candidate windows. During classification, the extracted proposal regions are fed to a CNN for hand detection and type classification. Evaluation of the proposed hybrid approach on four well-known datasets of gestures and signing demonstrates its superior accuracy and computational efficiency over the R-CNN and its variants.
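The tracking step in the abstract can be illustrated with a minimal sketch of how a Kalman filter might turn noisy hand-centroid measurements into a single candidate window per frame. The abstract does not give the state model, noise parameters, or window size, so everything below is an assumption made for illustration: a constant-velocity model applied independently to each image axis, the values of `q` and `r`, the initial covariance, and the hypothetical names `AxisKalman`, `HandTracker`, and `candidate_window`. The centroid measurements would come from the skin-tone segmentation, which is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis.

    State is (position, velocity); only position is measured.
    q (process noise) and r (measurement noise) are assumed values.
    """
    pos: float
    vel: float = 0.0
    q: float = 0.01
    r: float = 1.0
    # 2x2 state covariance: confident in position, uncertain in velocity.
    P: list = field(default_factory=lambda: [[1.0, 0.0], [0.0, 100.0]])

    def predict(self, dt=1.0):
        """Propagate the state one step: P <- F P F^T + Q, F = [[1, dt], [0, 1]]."""
        (a, b), (c, d) = self.P
        self.pos += dt * self.vel
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.pos

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r                 # innovation variance
        k0, k1 = a / s, c / s          # Kalman gain
        y = z - self.pos               # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


class HandTracker:
    """Tracks a hand centroid with one independent filter per axis."""

    def __init__(self, x, y):
        self.fx, self.fy = AxisKalman(x), AxisKalman(y)

    def step(self, measurement=None):
        """Predict the centroid for the next frame, then correct it if a
        measurement (x, y) is available. Returns the predicted centroid."""
        cx, cy = self.fx.predict(), self.fy.predict()
        if measurement is not None:
            self.fx.update(measurement[0])
            self.fy.update(measurement[1])
        return cx, cy


def candidate_window(center, half_size=32):
    """Square proposal window (x0, y0, x1, y1) around a predicted centroid."""
    cx, cy = center
    return (cx - half_size, cy - half_size, cx + half_size, cy + half_size)


if __name__ == "__main__":
    tracker = HandTracker(0.0, 0.0)
    for t in range(1, 20):
        tracker.step((10.0 * t, 5.0 * t))   # synthetic centroid measurements
    print(candidate_window(tracker.step()))  # window proposed for the next frame
```

In a pipeline of this kind, each tracked hand would contribute one such window per frame to the classifier, in contrast to the large number of proposals the abstract attributes to selective search.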
Pages: 6
Related papers
45 in total
[1]  
Amir A., 2017, PROC CVPR IEEE, P7243, DOI 10.1109/CVPR.2017.781
[2]   Lending A Hand: Detecting Hands and Recognizing Activities in Complex Egocentric Interactions [J].
Bambach, Sven ;
Lee, Stefan ;
Crandall, David J. ;
Yu, Chen .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1949-1957
[3]  
Buehler P., 2008, P BRIT MACH VIS C
[4]  
Caputo M., 2012, MENSCH COMPUTER, P293
[5]  
Chen L. C., 2016, COMPUTING RES REPOSI
[6]   Recurrent Convolutional Neural Networks for Continuous Sign Language Recognition by Staged Optimization [J].
Cui, Runpeng ;
Liu, Hu ;
Zhang, Changshui .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1610-1618
[7]   View-based interpretation of real-time optical flow for gesture recognition [J].
Cutler, R ;
Turk, M .
AUTOMATIC FACE AND GESTURE RECOGNITION - THIRD IEEE INTERNATIONAL CONFERENCE PROCEEDINGS, 1998, :416-421
[8]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[9]  
Dreuw P, 2008, SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, P1115
[10]  
Escalera S, 2013, ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, P365