From Actemes to Action: A Strongly-supervised Representation for Detailed Action Understanding

Cited by: 239
Authors
Zhang, Weiyu [1]
Zhu, Menglong [1]
Derpanis, Konstantinos G. [2]
Affiliations
[1] Univ Penn, GRASP Lab, Philadelphia, PA 19104 USA
[2] Ryerson Univ, Dept Comp Sci, Toronto, ON, Canada
Source
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2013
DOI
10.1109/ICCV.2013.280
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel approach for analyzing human actions in non-scripted, unconstrained video settings based on volumetric, x-y-t, patch classifiers, termed actemes. Unlike previous action-related work, the discovery of patch classifiers is posed as a strongly-supervised process. Specifically, keypoint labels (e.g., position) across spacetime are used in a data-driven training process to discover patches that are highly clustered in the spacetime keypoint configuration space. To support this process, a new human action dataset consisting of challenging consumer videos is introduced, where notably the action label, the 2D position of a set of keypoints, and their visibilities are provided for each video frame. On a novel input video, each acteme is used in a sliding volume scheme to yield a set of sparse, non-overlapping detections. These detections provide the intermediate substrate for segmenting out the action. For action classification, the proposed representation shows significant improvement over state-of-the-art low-level features, while providing spatiotemporal localization as additional output. This output sheds further light on detailed action understanding.
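The sliding volume scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the video tensor layout, the `scorer` callable standing in for a trained acteme classifier, and the greedy non-overlap selection are all illustrative assumptions.

```python
import numpy as np

def sliding_volume_detections(video, scorer, vol=(8, 16, 16),
                              stride=(4, 8, 8), thresh=0.9):
    """Scan an x-y-t classifier over a video (T, H, W) and return
    sparse, non-overlapping detections (score, t, y, x).

    `scorer` is a hypothetical stand-in for a trained acteme classifier:
    it maps a (vt, vh, vw) patch to a confidence score.
    """
    T, H, W = video.shape
    vt, vh, vw = vol
    st, sh, sw = stride
    candidates = []
    # Exhaustive scan of the spacetime volume at the given stride.
    for t in range(0, T - vt + 1, st):
        for y in range(0, H - vh + 1, sh):
            for x in range(0, W - vw + 1, sw):
                score = scorer(video[t:t + vt, y:y + vh, x:x + vw])
                if score >= thresh:
                    candidates.append((score, t, y, x))
    # Greedy suppression: keep the highest-scoring detections first,
    # rejecting any candidate whose volume overlaps an already-kept one.
    candidates.sort(reverse=True)
    kept = []
    for s, t, y, x in candidates:
        if all(abs(t - t2) >= vt or abs(y - y2) >= vh or abs(x - x2) >= vw
               for _, t2, y2, x2 in kept):
            kept.append((s, t, y, x))
    return kept
```

For example, with a synthetic video containing a single bright x-y-t cube and mean intensity as the scorer, the scan returns one detection anchored at the cube's corner.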
Pages: 2248-2255
Page count: 8