An approach to pose-based action recognition

Cited by: 246
Authors
Wang, Chunyu [1 ]
Wang, Yizhou [1 ]
Yuille, Alan L. [2 ]
Affiliations
[1] Peking Univ, Schl EECS, Key Lab Machine Percept MoE, Natl Engn Lab Video Technol, Beijing 100871, Peoples R China
[2] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA USA
Source
2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2013
DOI
10.1109/CVPR.2013.123
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We address action recognition in videos by modeling the spatial-temporal structures of human poses. We start by improving a state-of-the-art method for estimating human joint locations from videos. More precisely, we obtain the K-best estimates output by the existing method and incorporate additional segmentation cues and temporal constraints to select the "best" one. Then we group the estimated joints into five body parts (e.g., the left arm) and apply data mining techniques to obtain a representation for the spatial-temporal structures of human actions. This representation captures the spatial configurations of body parts in one frame (by spatial-part-sets) as well as the body part movements (by temporal-part-sets) that are characteristic of human actions. It is interpretable, compact, and robust to errors in joint estimation. Experimental results first show that our approach localizes body joints more accurately than existing methods. Next, we show that it outperforms state-of-the-art action recognizers on the UCF Sports, Keck Gesture, and MSR-Action3D datasets.
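The selection step described in the abstract (choosing one pose per frame from K candidates under temporal constraints) can be illustrated with a generic dynamic-programming sketch. This is not the authors' formulation: the abstract does not give their scoring function or how the segmentation cues enter it. The function name `select_best_poses`, the per-candidate `unary_scores` (standing in for any per-frame cue), the mean joint displacement penalty, and `smooth_weight` are all assumptions made for illustration.

```python
import math


def select_best_poses(candidates, unary_scores, smooth_weight=1.0):
    """Pick one candidate pose per frame with Viterbi-style dynamic programming.

    candidates[t][k]   : list of (x, y) joint positions for candidate k in frame t
    unary_scores[t][k] : hypothetical per-frame quality score (higher is better)
    smooth_weight      : hypothetical weight on the temporal smoothness penalty
    Assumes every frame has the same number of candidates K.
    """
    num_frames = len(candidates)
    num_cands = len(candidates[0])

    def mean_joint_distance(pose_a, pose_b):
        # Average Euclidean displacement of corresponding joints.
        return sum(math.dist(a, b) for a, b in zip(pose_a, pose_b)) / len(pose_a)

    best = [list(unary_scores[0])]   # best[t][k]: best total score ending at k
    back = [[0] * num_cands]         # back[t][k]: predecessor achieving best[t][k]
    for t in range(1, num_frames):
        row_best, row_back = [], []
        for j in range(num_cands):
            scores = [
                best[t - 1][i]
                - smooth_weight
                * mean_joint_distance(candidates[t - 1][i], candidates[t][j])
                for i in range(num_cands)
            ]
            i_star = max(range(num_cands), key=scores.__getitem__)
            row_back.append(i_star)
            row_best.append(scores[i_star] + unary_scores[t][j])
        best.append(row_best)
        back.append(row_back)

    # Trace back the highest-scoring sequence of candidate indices.
    path = [0] * num_frames
    path[-1] = max(range(num_cands), key=best[-1].__getitem__)
    for t in range(num_frames - 1, 0, -1):
        path[t - 1] = back[t][path[t]]
    return path


# Toy input: 3 frames, 2 candidates each, 1 joint per pose.
frames = [[[(0.0, 0.0)], [(10.0, 10.0)]] for _ in range(3)]
scores = [[1.0, 0.0], [0.0, 0.5], [1.0, 0.0]]
smooth_path = select_best_poses(frames, scores, smooth_weight=1.0)
greedy_path = select_best_poses(frames, scores, smooth_weight=0.0)
```

With the smoothness term active, the toy input keeps the same candidate in every frame; with `smooth_weight=0.0` the choice reduces to the per-frame best score, which is the behavior a temporal constraint is meant to correct.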
Pages: 915-922
Number of pages: 8