Mining Layered Grammar Rules for Action Recognition

Cited by: 0
Authors
Liang Wang
Yizhou Wang
Wen Gao
Affiliations
[1] Harbin Institute of Technology, School of Computer Science and Technology
[2] Peking University, National Engineering Lab for Video Technology
[3] Peking University, National Engineering Lab for Video Technology and Key Lab. of Machine Perception (MoE), School of Electronics Engineering and Computer Science
Source
International Journal of Computer Vision | 2011 / Vol. 93
Keywords
Action recognition; Layered-grammar model; Action parse tree; Emerging pattern mining
DOI
Not available
Abstract
We propose a layered-grammar model to represent actions. Using this model, an action is represented by a set of grammar rules. The bottom layer of an action instance’s parse tree contains action primitives such as spatiotemporal (ST) interest points. At each layer above, we iteratively mine grammar rules and “super rules” that account for the high-order compositional feature structures. The grammar rules are categorized into three classes according to three different ST-relations of their action components, namely the strong relation, weak relation and stochastic relation. These ST-relations characterize different action styles (degree of stiffness), and they are pursued in terms of grammar rules for the purpose of action recognition. By adopting the Emerging Pattern (EP) mining algorithm for relation pursuit, the learned production rules are statistically significant and discriminative. Using the learned rules, the parse tree of an action video is constructed by combining a bottom-up rule detection step and a top-down ambiguous rule pruning step. An action instance is recognized based on the discriminative configurations generated by the production rules of its parse tree. Experiments confirm that by incorporating the high-order feature statistics, the proposed method largely improves the recognition performance over the bag-of-words models.
Pages: 162-182
Page count: 20