Motion keypoint trajectory and covariance descriptor for human action recognition

Cited by: 26

Authors:

Yi, Yun [1,2,3]

Wang, Hanli [1,2]

Affiliations:

[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China

[2] Tongji Univ, Key Lab Embedded Syst & Serv Comp, Minist Educ, Shanghai 200092, Peoples R China

[3] Gannan Normal Univ, Dept Math & Comp Sci, Ganzhou 341000, Peoples R China

Source:

VISUAL COMPUTER | 2018 / Vol. 34 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

Human action recognition; Motion keypoint trajectory; Optical flow rectification; Trajectory-based covariance descriptor; REGION COVARIANCE; HISTOGRAMS; DENSE;

DOI:

10.1007/s00371-016-1345-6

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Human action recognition from videos is a challenging task in computer vision. In recent years, histogram-based descriptors that are calculated along dense trajectories have shown promising results for human action recognition, but they usually ignore the motion information of the tracking points, and the relationship between different motion variables is not well utilized. To address this issue, we propose a motion keypoint trajectory (MKT) approach and a trajectory-based covariance (TBC) descriptor, which is calculated along the motion keypoint trajectories. The proposed MKT approach tracks motion keypoints at multiple spatial scales and employs an optical flow rectification algorithm to reduce the influence of camera motion, and thus achieves better performance than the improved dense trajectory (IDT) approach well known in the literature. In particular, MKT is faster than IDT, because MKT does not need human detection and extracts fewer trajectories than IDT. Furthermore, the TBC descriptor outperforms the classical histogram-based descriptors, such as the Histogram of Oriented Gradient, Histogram of Optical Flow and Motion Boundary Histogram. Experimental results on three challenging datasets (i.e., Olympic Sports, HMDB51 and UCF50) demonstrate that our approach achieves better recognition performance than a number of state-of-the-art approaches.
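As a rough illustration of the general idea behind a covariance-style descriptor, the sketch below computes the sample covariance of per-point feature vectors collected along one trajectory and keeps its upper triangle as the descriptor. This is a generic covariance construction, not the paper's exact TBC descriptor: the function name `covariance_descriptor`, the choice of two motion variables per point, and the sample values are all assumptions made for illustration.

```python
from typing import List, Sequence


def covariance_descriptor(samples: Sequence[Sequence[float]]) -> List[float]:
    """Return the upper triangle of the sample covariance matrix.

    `samples` holds one feature vector per tracked point (hypothetical
    layout; the abstract does not specify which variables are used).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate covariance")
    d = len(samples[0])
    # Per-dimension mean over all points on the trajectory.
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    # Unbiased sample covariance (divides by n - 1).
    cov = [
        [
            sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    # The matrix is symmetric, so the upper triangle carries all of it.
    return [cov[i][j] for i in range(d) for j in range(i, d)]


# Assumed example: (u, v) optical-flow components at three tracked points.
track = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
descriptor = covariance_descriptor(track)
print(descriptor)  # [var(u), cov(u, v), var(v)]
```

Unlike a histogram, the off-diagonal entries record how pairs of variables change together, which is the kind of relationship between motion variables the abstract says histogram-based descriptors leave unused.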


Pages: 391-403

Page count: 13

41 references in total

[1] [Anonymous], 2003, Geodesy-the Challenge of the 3rd Millennium, DOI 10.1007/978-3-662-05296-9_31

[2] Log-euclidean metrics for fast and simple calculus on diffusion tensors [J]. Arsigny, Vincent; Fillard, Pierre; Pennec, Xavier; Ayache, Nicholas. MAGNETIC RESONANCE IN MEDICINE, 2006, 56 (02): 411-421

[3] Speeded-Up Robust Features (SURF) [J]. Bay, Herbert; Ess, Andreas; Tuytelaars, Tinne; Van Gool, Luc. COMPUTER VISION AND IMAGE UNDERSTANDING, 2008, 110 (03): 346-359

[4] Bilinski P, 2015, PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), P2140

[5] Video-Based Human Behavior Understanding: A Survey [J]. Borges, Paulo Vinicius Koerich; Conci, Nicola; Cavallaro, Andrea. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2013, 23 (11): 1993-2008

[6] Brendel W, 2011, IEEE I CONF COMP VIS, P778, DOI 10.1109/ICCV.2011.6126316

[7] Exploring Temporal Structure of Trajectory Components for Action Recognition [J]. Cheng, Guangchun; Huang, Yan; Wan, Yiwen; Buckles, Bill P. INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2015, 30 (02): 99-119

[8] Histograms of oriented gradients for human detection [J]. Dalal, N; Triggs, B. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005: 886-893

[9] Human detection using oriented histograms of flow and appearance [J]. Dalal, Navneet; Triggs, Bill; Schmid, Cordelia. COMPUTER VISION - ECCV 2006, PT 2, PROCEEDINGS, 2006, 3952: 428-441

[10] A comprehensive survey of human action recognition with spatio-temporal interest point (STIP) detector [J]. Das Dawn, Debapratim; Shaikh, Soharab Hossain. VISUAL COMPUTER, 2016, 32 (03): 289-306
