Fusing $\mathcal{R}$ Features and Local Features with Context-Aware Kernels for Action Recognition

Cited by: 0
Authors
Chunfeng Yuan
Baoxin Wu
Xi Li
Weiming Hu
Stephen Maybank
Fangshi Wang
Affiliations
[1] CAS, National Laboratory of Pattern Recognition, Institute of Automation
[2] Zhejiang University, College of Computer Science and Technology
[3] Birkbeck College, Department of Computer Science and Information Systems
[4] Beijing Jiaotong University, School of Software Engineering
Keywords
Action recognition; Spatio-temporal interest points; 3D $\mathcal{R}$ transform; Hypergraph; Context-aware kernel
DOI
10.1007/s11263-015-0867-0
Abstract
The performance of action recognition in video sequences depends significantly on the representation of actions and the similarity measurement between the representations. In this paper, we combine two kinds of features extracted from the spatio-temporal interest points with context-aware kernels for action recognition. For the action representation, local cuboid features extracted around interest points are very popular using a Bag of Visual Words (BOVW) model. Such representations, however, ignore potentially valuable information about the global spatio-temporal distribution of interest points. We propose a new global feature to capture the detailed geometrical distribution of interest points. It is calculated by using the 3D $\mathcal{R}$ transform, which is defined as an extended 3D discrete Radon transform, followed by the application of a two-directional two-dimensional principal component analysis. For the similarity measurement, we model a video set as an optimized probabilistic hypergraph and propose a context-aware kernel to measure high order relationships among videos. The context-aware kernel is more robust to the noise and outliers in the data than the traditional context-free kernel, which just considers the pairwise relationships between videos. The hyperedges of the hypergraph are constructed based on a learnt Mahalanobis distance metric. Any disturbing information from other classes is excluded from each hyperedge. Finally, a multiple kernel learning algorithm is designed by integrating the $l_{2}$ norm regularization into a linear SVM classifier to fuse the $\mathcal{R}$ feature and the BOVW representation for action recognition. Experimental results on several datasets demonstrate the effectiveness of the proposed approach for action recognition.
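The abstract names a two-directional two-dimensional principal component analysis step but does not say how it is computed. As a rough illustration only, the sketch below follows the commonly described two-directional 2D PCA formulation: eigenvectors of the row-direction and column-direction scatter matrices are used to project each 2D sample from both sides. It assumes every sample is a fixed-size 2D matrix; the function name `two_directional_2dpca`, its parameters, and the random input are hypothetical and are not taken from the paper.

```python
import numpy as np


def two_directional_2dpca(samples, n_row_components, n_col_components):
    """Illustrative two-directional 2D PCA (hypothetical helper, not the paper's code).

    samples: array of shape (N, m, n), one m-by-n matrix per sample.
    Returns (Z, X, features) where Z is m-by-q, X is n-by-d, and
    features has shape (N, q, d) with features[k] = Z.T @ samples[k] @ X.
    """
    samples = np.asarray(samples, dtype=float)
    centered = samples - samples.mean(axis=0)
    count = len(samples)

    # Row-direction scatter (n x n): average of A^T A over centered samples.
    scatter_row = np.einsum("kij,kil->jl", centered, centered) / count
    # Column-direction scatter (m x m): average of A A^T over centered samples.
    scatter_col = np.einsum("kij,klj->il", centered, centered) / count

    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the leading eigenvectors first.
    _, vecs_row = np.linalg.eigh(scatter_row)
    _, vecs_col = np.linalg.eigh(scatter_col)
    X = vecs_row[:, ::-1][:, :n_col_components]  # right projection, n x d
    Z = vecs_col[:, ::-1][:, :n_row_components]  # left projection, m x q

    # Project every sample from both sides: C_k = Z^T A_k X.
    features = np.einsum("iq,kij,jd->kqd", Z, samples, X)
    return Z, X, features


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(10, 6, 8))  # 10 synthetic 6-by-8 samples
    Z, X, feats = two_directional_2dpca(demo, 3, 2)
    print(Z.shape, X.shape, feats.shape)
```

Each reduced matrix in `features` could then be flattened into a vector before being passed to a kernel; that flattening is a design choice made here for illustration, not something the abstract specifies.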
Pages: 151-171
Page count: 20