Spatio-temporal Semantic Features for Human Action Recognition

Cited by: 0
Authors
Liu, Jia [1 ,2 ]
Wang, Xiaonian [1 ]
Li, Tianyu [1 ]
Yang, Jie [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai 200030, Peoples R China
[2] Armed Police Forces, Coll Engn, Network & Informat Secur Key Lab, Xian 710086, Peoples R China
Keywords
action recognition; spatio-temporal features; topic model; Markov model
DOI
10.3837/tiis.2012.10.011
CLC Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Most approaches to human action recognition are limited because they rely on simple action datasets captured under controlled environments, or focus on excessively localized features without sufficiently exploiting spatio-temporal information. This paper proposes a framework for recognizing realistic human actions. Specifically, a new action representation is introduced that computes a rich set of descriptors from keypoint trajectories. To obtain efficient and compact action representations, we develop a feature fusion method that combines spatio-temporal local motion descriptors according to the camera motion, which is detected from the distribution of spatio-temporal interest points in the clips. A new topic model, the Markov Semantic Model, is proposed for semantic feature selection; it exploits the different kinds of dependencies between words induced by "syntactic" (short-range) and "semantic" (long-range) constraints, and selects the informative features collaboratively on that basis. Building on nonlinear SVMs, we validate the proposed hierarchical framework on several realistic action datasets.
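To make the overall pipeline concrete, the sketch below shows a generic bag-of-visual-words representation over local spatio-temporal descriptors followed by a nonlinear (RBF) SVM classifier, which is the standard backbone the abstract builds on. This is only an illustration under assumed names and synthetic data: the trajectory descriptor extraction, the camera-motion-aware feature fusion, and the Markov Semantic Model feature selection described in the paper are not reproduced here, and `fake_clip_descriptors` is a hypothetical stand-in for real per-clip descriptors.

```python
# Minimal sketch (not the paper's implementation): bag-of-words over local
# spatio-temporal descriptors + nonlinear SVM. Synthetic data stands in for
# descriptors computed along keypoint trajectories.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def fake_clip_descriptors(label, n_desc=200, dim=96):
    # Hypothetical stand-in: each clip yields many local descriptors
    # (e.g., HOG/HOF-like vectors along trajectories).
    return rng.normal(loc=label, scale=1.0, size=(n_desc, dim))

n_classes, clips_per_class = 3, 20
clips, labels = [], []
for c in range(n_classes):
    for _ in range(clips_per_class):
        clips.append(fake_clip_descriptors(c))
        labels.append(c)

# 1) Learn a visual vocabulary by clustering all local descriptors.
vocab_size = 50
codebook = KMeans(n_clusters=vocab_size, n_init=10, random_state=0)
codebook.fit(np.vstack(clips))

# 2) Represent each clip as a normalized histogram of visual words.
def bow_histogram(descriptors):
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / hist.sum()

X = np.array([bow_histogram(d) for d in clips])
y = np.array(labels)

# 3) Train and evaluate a nonlinear SVM on the clip-level histograms.
idx = rng.permutation(len(X))
split = len(X) * 3 // 4
train, test = idx[:split], idx[split:]
clf = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X[train], y[train])
print("held-out accuracy:", clf.score(X[test], y[test]))
```

In the paper's framework, the histogram step would be replaced by the fused trajectory descriptors, with the Markov Semantic Model selecting informative words before classification; the RBF-SVM stage corresponds to the nonlinear SVMs mentioned in the abstract.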
Pages: 2632-2649
Number of pages: 18