Learning hierarchical 3D kernel descriptors for RGB-D action recognition

Cited by: 16

Authors:

Kong, Yu [1]

Satarboroujeni, Behnam [1]

Fu, Yun [1,2]

Affiliations:

[1] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA

[2] Northeastern Univ, Coll Comp & Informat Sci, Boston, MA 02115 USA

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2016 / Vol. 144

Funding:

National Science Foundation (USA);

Keywords:

RGB-D action; Action recognition; Kernel descriptor;

DOI:

10.1016/j.cviu.2015.10.001

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Human action recognition is an important and challenging task because of intra-class variation and the complexity of actions, which are caused by the diverse styles and durations of performed actions. Previous works mostly concentrate on either depth or RGB data to understand the shape and movement cues in videos, but fail to exploit the rich information in both channels simultaneously. In this paper, we study the problem of RGB-D action recognition from both RGB and depth sequences using kernel descriptors. Kernel descriptors provide a unified and elegant framework for turning pixel-level attributes into descriptive information about the actions performed in a video. We show that simple kernel descriptors over pixel attributes in video sequences achieve strong results compared with complex state-of-the-art methods. Following the success of kernel descriptors (Bo et al., 2010) on object recognition tasks, we claim that 3D kernel descriptors can be an effective way to project low-level features on 3D patches into a powerful structure that effectively describes the scene. We build our system upon the 3D gradient kernel descriptor and construct a hierarchical framework by employing the efficient match kernel (EMK) (Bo and Sminchisescu, 2009) and hierarchical kernel descriptors (HKD) as higher levels to abstract the mid-level features for classification. Through extensive experiments, we demonstrate that the proposed approach achieves superior performance on four standard RGB-D sequence benchmarks. © 2015 Elsevier Inc. All rights reserved.


Pages: 14-23

Page count: 10

References (42 in total):

[1] Andre Chaaraoui, Alexandros; Ramon Padilla-Lopez, Jose; Climent-Perez, Pau; Florez-Revuelta, Francisco. Evolutionary joint selection to improve human action recognition with RGB-D devices [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (03): 786-794

[2] [Anonymous], 2013, P 23 INT JOINT C ART

[3] [Anonymous], 1999, Tech. Rep.

[4] [Anonymous], 2012, P ACM INT C MULT NAR, DOI 10.1145/2393347.2396382

[5] [Anonymous], P ICCV

[6] Bo L., 2009, P NIPS

[7] Bo L., 2010, P NIPS

[8] Chen L., 2014, P CVPR

[9] Dalal, N; Triggs, B. Histograms of oriented gradients for human detection [J]. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005: 886-893

[10] Dollar P., 2005, VISUAL SURVEILLANCE, V14, P65, DOI 10.1109/VSPETS.2005.1570899
