Human Action Recognition in Unconstrained Videos by Explicit Motion Modeling

Cited by: 61

Authors:

Jiang, Yu-Gang [1]

Dai, Qi [1]

Liu, Wei [2]

Xue, Xiangyang [1]

Ngo, Chong-Wah [3]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China

[2] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA

[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2015 / Vol. 24 / No. 11

Funding:

U.S. National Science Foundation;

Keywords:

Human action recognition; trajectory; motion; representation; reference points; camera motion; SUPER-VECTOR; CLASSIFICATION; HISTOGRAMS; DENSE;

DOI:

10.1109/TIP.2015.2456412

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Human action recognition in unconstrained videos is a challenging problem with many applications. Most state-of-the-art approaches adopt the well-known bag-of-features representations, generated from isolated local patches or patch trajectories, in which motion patterns such as object-object and object-background relationships are mostly discarded. In this paper, we propose a simple representation aimed at modeling these motion relationships. We adopt global and local reference points to explicitly characterize motion information, so that the final representation is more robust to camera movements, which are common in unconstrained videos. Our approach operates on top of visual codewords generated on dense local patch trajectories and therefore does not require foreground-background separation, which is normally a critical and difficult step in modeling object relationships. Through an extensive set of experimental evaluations, we show that the proposed representation achieves very competitive performance on several challenging benchmark data sets. Further combining it with the standard bag-of-features or Fisher vector representations can lead to substantial improvements.
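The robustness claim in the abstract can be illustrated with a small numerical sketch. This is not the authors' method: it assumes a uniform camera translation between two frames, uses hypothetical displacement values, and stands in for a "global reference point" with the median displacement over all trajectories. The function name `relative_displacements` is made up for this illustration. Under those assumptions, motion measured relative to the reference is unchanged by the camera shift.

```python
import numpy as np


def relative_displacements(displacements, reference):
    """Express each trajectory displacement relative to a reference motion."""
    return np.asarray(displacements, dtype=float) - np.asarray(reference, dtype=float)


# Hypothetical (dx, dy) displacements of four trajectories between two frames:
# two belong to a moving object, two to the static background.
object_motion = np.array([[2.0, 0.0], [2.0, 0.5], [0.0, 0.0], [0.0, 0.0]])

# Assumed uniform camera translation added to every observed displacement.
camera_shift = np.array([5.0, -3.0])
observed = object_motion + camera_shift

# Assumed global reference: the median displacement over all trajectories.
ref_static = np.median(object_motion, axis=0)
ref_moving = np.median(observed, axis=0)

rel_static = relative_displacements(object_motion, ref_static)
rel_moving = relative_displacements(observed, ref_moving)

# The raw displacements differ, but the reference-relative ones agree.
print(np.allclose(object_motion, observed))
print(np.allclose(rel_static, rel_moving))
```

The sketch covers only the translation case; rotation, zoom, and the local reference points mentioned in the abstract are outside its scope.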


Pages: 3781-3795

Page count: 15
