Super Normal Vector for Human Activity Recognition with Depth Cameras

Citations: 112

Authors:

Yang, Xiaodong [1]

Tian, YingLi [2,3]

Affiliations:

[1] NVIDIA Res, Santa Clara, CA 95050 USA

[2] CUNY City Coll, Dept Elect Engn, New York, NY 10031 USA

[3] CUNY, Grad Ctr, New York, NY 10031 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2017, Vol. 39, No. 5

Funding:

U.S. National Science Foundation

Keywords:

Human activity recognition; depth camera; feature representation; spatio-temporal information; IMAGE CLASSIFICATION; ACTIONLET ENSEMBLE; MOTION; POSE;

DOI:

10.1109/TPAMI.2016.2565479

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The advent of cost-effective and easy-to-operate depth cameras has facilitated a variety of visual recognition tasks, including human activity recognition. This paper presents a novel framework for recognizing human activities from video sequences captured by depth cameras. We extend the surface normal to the polynormal by assembling local neighboring hypersurface normals from a depth sequence to jointly characterize local motion and shape information. We then propose a general scheme, the super normal vector (SNV), to aggregate the low-level polynormals into a discriminative representation, which can be viewed as a simplified version of the Fisher kernel representation. To globally capture the spatial layout and temporal order, an adaptive spatio-temporal pyramid is introduced to subdivide a depth video into a set of space-time cells. In extensive experiments, the proposed approach outperforms state-of-the-art methods on four public benchmark datasets: MSRAction3D, MSRDailyActivity3D, MSRGesture3D, and MSRActionPairs3D.
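The abstract's first step, assembling neighboring hypersurface normals into a polynormal, can be sketched roughly as follows. This is only an illustration, not the authors' implementation: it assumes the depth sequence is treated as a function z(x, y, t), that the hypersurface normal is taken as (-∂z/∂x, -∂z/∂y, -∂z/∂t, 1) normalized to unit length, and that a polynormal is the concatenation of the normals in a small space-time neighborhood. The function names and the `radius` parameter are hypothetical.

```python
import numpy as np


def hypersurface_normals(depth):
    """Unit normals of the hypersurface z(x, y, t) for a depth video.

    depth: array of shape (T, H, W), each axis of length >= 2.
    Returns an array of shape (T, H, W, 4).
    Assumption: normal = (-dz/dx, -dz/dy, -dz/dt, 1), then L2-normalized.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # np.gradient returns one array per axis, in axis order (t, y, x).
    dz_dt, dz_dy, dz_dx = np.gradient(depth)
    normals = np.stack([-dz_dx, -dz_dy, -dz_dt, np.ones_like(depth)], axis=-1)
    return normals / np.linalg.norm(normals, axis=-1, keepdims=True)


def polynormal(normals, t, y, x, radius=(1, 1, 1)):
    """Concatenate the normals in a local space-time neighborhood.

    radius is a hypothetical (rt, ry, rx) half-size of the neighborhood.
    Returns a 1-D vector of length 4 * (2*rt+1) * (2*ry+1) * (2*rx+1).
    """
    rt, ry, rx = radius
    T, H, W, _ = normals.shape
    if not (rt <= t < T - rt and ry <= y < H - ry and rx <= x < W - rx):
        raise ValueError("neighborhood falls outside the video volume")
    block = normals[t - rt:t + rt + 1, y - ry:y + ry + 1, x - rx:x + rx + 1]
    return block.reshape(-1)
```

With the default radius, each polynormal stacks 27 four-dimensional normals into a single low-level descriptor; the aggregation into SNV and the adaptive spatio-temporal pyramid described in the abstract are not covered by this sketch.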

Pages: 1028-1039

Page count: 12
