Actionness Ranking with Lattice Conditional Ordinal Random Fields

Cited by: 29

Authors:

Chen, Wei [1]

Xiong, Caiming [1]

Xu, Ran [1]

Corso, Jason J. [1]

Affiliations:

[1] SUNY Buffalo, Comp Sci & Engn, Buffalo, NY 14260 USA

Source:

2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2014

Keywords:

DOI:

10.1109/CVPR.2014.101

Chinese Library Classification (CLC):

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Action analysis in image and video has been attracting more and more attention in computer vision. Recognizing specific actions in video clips has been the main focus. We move in a new, more general direction in this paper and ask the critical fundamental question: what is action, how is action different from motion, and in a given image or video where is the action? We study the philosophical and visual characteristics of action, which lead us to define actionness: intentional bodily movement of biological agents (people, animals). To solve the general problem, we propose the lattice conditional ordinal random field model that incorporates local evidence as well as neighboring order agreement. We implement the new model in the continuous domain and apply it to scoring actionness in both image and video datasets. Our experiments demonstrate not only that our new model can outperform the popular ranking SVM but also that indeed action is distinct from motion.

Citation

Pages: 748-755

Page count: 8

References: 40 in total
