Comparing random forest approaches to segmenting and classifying gestures

Cited by: 34
Authors
Joshi, Ajjen [1]
Monnier, Camille [2]
Betke, Margrit [1]
Sclaroff, Stan [1]
Affiliations
[1] Boston Univ, Dept Comp Sci, 111 Cummington St, Boston, MA 02215 USA
[2] Charles River Analyt, Cambridge, MA 02138 USA
Funding
National Science Foundation (USA);
Keywords
Gesture spotting; Gesture classification; Random forest classifier; RECOGNITION;
DOI
10.1016/j.imavis.2016.06.001
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
A complete gesture recognition system should localize and classify each gesture from a given gesture vocabulary, within a continuous video stream. In this work, we compare two approaches: a method that performs the tasks of temporal segmentation and classification simultaneously with another that performs the tasks sequentially. The first method trains a single random forest model to recognize gestures from a given vocabulary, as presented in a training dataset of video plus 3D body joint locations, as well as out-of-vocabulary (non-gesture) instances. The second method employs a cascaded approach, training a binary random forest model to distinguish gestures from background and a multi-class random forest model to classify segmented gestures. Given a test input video stream, both frameworks are applied using sliding windows at multiple temporal scales. We evaluated our formulation in segmenting and recognizing gestures from two different benchmark datasets: the NATOPS dataset of 9600 gesture instances from a vocabulary of 24 aircraft handling signals, and the ChaLearn dataset of 7754 gesture instances from a vocabulary of 20 Italian communication gestures. The performance of our method compares favorably with state-of-the-art methods that employ Hidden Markov Models or Hidden Conditional Random Fields on the NATOPS dataset. We conclude with a discussion of the advantages of using our model for the task of gesture recognition and segmentation, and outline weaknesses which need to be addressed in the future. © 2016 Elsevier B.V. All rights reserved.
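The cascaded, multi-scale sliding-window procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sliding_windows` and `cascade_detect`, the `threshold` and `stride` parameters, and the per-frame feature representation are all assumptions made for illustration. The two classifiers are passed in as plain callables; in the paper's setting they would be a binary random forest (gesture versus background) and a multi-class random forest.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]    # assumed: one feature vector per video frame
Window = Tuple[int, int]   # (start, end) frame indices, end exclusive


def sliding_windows(n_frames: int, scales: Sequence[int], stride: int) -> List[Window]:
    """Enumerate windows of every length in `scales` over n_frames frames."""
    windows = []
    for length in scales:
        for start in range(0, n_frames - length + 1, stride):
            windows.append((start, start + length))
    return windows


def cascade_detect(
    frames: Sequence[Frame],
    scales: Sequence[int],
    stride: int,
    is_gesture: Callable[[Sequence[Frame]], float],  # stage 1: gesture score
    classify: Callable[[Sequence[Frame]], str],      # stage 2: gesture label
    threshold: float = 0.5,                          # hypothetical cut-off
) -> List[Tuple[int, int, str, float]]:
    """Stage 1 rejects background windows; stage 2 labels the survivors."""
    detections = []
    for start, end in sliding_windows(len(frames), scales, stride):
        segment = frames[start:end]
        score = is_gesture(segment)
        if score >= threshold:
            detections.append((start, end, classify(segment), score))
    return detections


# Toy usage with stand-in classifiers (not trained models).
frames = [[0.0], [0.0], [1.0], [1.0], [0.0], [0.0]]
mean_energy = lambda seg: sum(f[0] for f in seg) / len(seg)
toy_label = lambda seg: "stand-in label"
result = cascade_detect(frames, scales=[2], stride=2,
                        is_gesture=mean_energy, classify=toy_label)
```

The single-model variant described in the abstract would correspond to dropping stage 1 and letting one multi-class classifier output a dedicated non-gesture label for background windows.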
Pages: 86-95
Number of pages: 10