EXMOVES: Mid-level Features for Efficient Action Recognition and Video Analysis

Cited by: 11
Authors
Tran, Du [1]
Torresani, Lorenzo [1]
Affiliation
[1] Dartmouth Coll, Dept Comp Sci, 6211 Sudikoff Lab, Hanover, NH 03755 USA
Keywords
Action recognition; Action similarity labeling; Video representation; Mid-level features
DOI
10.1007/s11263-016-0905-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present EXMOVES, learned exemplar-based features for efficient recognition and analysis of actions in videos. The entries in our descriptor are produced by evaluating a set of movement classifiers over spatial-temporal volumes of the input video sequences. Each movement classifier is a simple exemplar-SVM trained on low-level features, i.e., an SVM learned using a single annotated positive space-time volume and a large number of unannotated videos. Our representation offers several advantages. First, since our mid-level features are learned from individual video exemplars, they require a minimal amount of supervision. Second, we show that simple linear classification models trained on our global video descriptor yield action recognition accuracy approaching the state-of-the-art but at orders of magnitude lower cost, since at test-time no sliding window is necessary and linear models are efficient to train and test. This enables scalable action recognition, i.e., efficient classification of a large number of actions even in massive video databases. Third, we show the generality of our approach by training our mid-level descriptors from different low-level features and testing them on two distinct video analysis tasks: human activity recognition as well as action similarity labeling. Experiments on large-scale benchmarks demonstrate the accuracy and efficiency of our proposed method on both of these tasks.
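As a rough illustration of the idea in the abstract (not the authors' implementation), the sketch below trains a linear exemplar classifier from one positive feature vector and many negatives, then builds a video descriptor with one entry per exemplar. The function names, the hinge-loss subgradient training, the positive-class weight, the max-pooling over volumes, and the toy 2-D features are all assumptions made for this example.

```python
import random


def train_exemplar_svm(positive, negatives, epochs=200, lr=0.01, reg=0.01,
                       pos_weight=10.0):
    """Fit a linear SVM from ONE positive and many negatives.

    Uses plain subgradient descent on the L2-regularised hinge loss.
    pos_weight (an assumed value) up-weights the single positive example.
    """
    w = [0.0] * len(positive)
    b = 0.0
    data = [(positive, 1.0, pos_weight)]
    data += [(x, -1.0, 1.0) for x in negatives]
    for _ in range(epochs):
        for x, y, c in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 shrinkage of the weights on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: move towards classifying x correctly.
                w = [wi + lr * c * y * xi for wi, xi in zip(w, x)]
                b += lr * c * y
    return w, b


def score(model, x):
    """Linear classifier response for one feature vector."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def video_descriptor(models, volumes):
    """One entry per exemplar: its maximum response over the video's volumes.

    Max-pooling is an assumption of this sketch.
    """
    return [max(score(m, v) for v in volumes) for m in models]


# Toy data: the exemplar sits at (1, 0); negatives cluster around (0, 1).
rng = random.Random(0)
exemplar = [1.0, 0.0]
negatives = [[rng.uniform(-0.1, 0.1), 1.0 + rng.uniform(-0.1, 0.1)]
             for _ in range(20)]

model = train_exemplar_svm(exemplar, negatives)
with_action = video_descriptor([model], [negatives[0], exemplar])
without_action = video_descriptor([model], negatives[:2])
print(with_action, without_action)
```

In this toy setting the descriptor entry is larger for the video that contains a volume resembling the exemplar. A linear classifier trained on such fixed-length descriptors then scores a whole video with a single dot product per action class.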
Pages: 239-253
Page count: 15