Action Recognition by Time Series of Retinotopic Appearance and Motion Features

Cited by: 20

Authors:

Barrett, Daniel Paul [1]

Siskind, Jeffrey Mark [1]

Affiliations:

[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2016 / Vol. 26 / No. 12

Funding:

U.S. National Science Foundation

Keywords:

Hidden Markov model (HMM); object detection; tracking; video action recognition; MODELS;

DOI:

10.1109/TCSVT.2015.2502839

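Chinese Library Classification (CLC):

TM [Electrical engineering technology]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract: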

We present a method for recognizing and localizing actions in video by the sequence of changing appearance and motion of the participants. Appearance is modeled by histogram of oriented gradients object detectors, while motion is modeled by optical-flow motion-pattern detectors. Sequencing is modeled by a hidden Markov model (HMM) whose output models are these appearance and motion detectors. The HMM and associated detectors are simultaneously trained, learning the sequence of detectors that match the most distinctive temporal subsequences of the action represented in the training data. Training uses both positive and negative samples of a given action class and is accomplished without the need for annotation of the correspondence between training video frames and the state-conditioned detectors, by minimizing a discriminative cost function through gradient descent. Trained models are used to perform recognition and localization by simultaneous detection, tracking, and action recognition. In contrast to many prior methods, our approach learns intuitively meaningful models that represent action as a sequence of retinotopic models. We demonstrate this by rendering these models on unseen test video. This method was found to perform competitively on three standard datasets, Weizmann, KTH, and UCF Sports, as well as on video from the Defense Advanced Research Projects Agency (DARPA) Mind's Eye program and a newly filmed dataset.
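
The sequencing idea in the abstract (an HMM whose states each score a frame with their own detector) can be illustrated with a minimal sketch. It assumes the per-frame detector scores are already available as log-likelihoods and uses standard Viterbi decoding to find the best alignment of frames to states. The function name `viterbi`, the two-state left-to-right structure, and the toy scores are all hypothetical choices for illustration; this is not the authors' implementation, and the joint discriminative training and tracking described above are not shown.

```python
import math

NEG_INF = float("-inf")  # log-probability of an impossible event


def viterbi(log_emit, log_trans, log_init):
    """Return (best_log_score, best_state_path) for one video.

    log_emit[t][s]:  log-score of the state-s detector on frame t (assumed given).
    log_trans[r][s]: log-probability of moving from state r to state s.
    log_init[s]:     log-probability of starting in state s.
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


# Toy data (illustrative only): 4 frames, 2 states, left-to-right transitions.
toy_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.8), math.log(0.2)],
            [math.log(0.2), math.log(0.8)],
            [math.log(0.1), math.log(0.9)]]
toy_trans = [[math.log(0.5), math.log(0.5)],
             [NEG_INF, 0.0]]  # state 1 cannot return to state 0
toy_init = [0.0, NEG_INF]     # always start in state 0

best_score, best_path = viterbi(toy_emit, toy_trans, toy_init)
```

On this toy input the decoded path assigns the first two frames to state 0 and the last two to state 1. A recognizer built on this kind of decoding could compare `best_score` across per-class models, although the thresholds and scoring used in the paper may differ.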

Pages: 2250-2263

Page count: 14
