The spatial Laplacian and temporal energy pyramid representation for human action recognition using depth sequences

Cited by: 44
Authors
Ji, Xiaopeng [1, 2, 3]
Cheng, Jun [1, 2, 3]
Tao, Dapeng [4]
Wu, Xinyu [1, 2, 3]
Feng, Wei [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Prov Key Lab Robot & Intelligent Syst, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Shenzhen Coll Adv Technol, Beijing, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
[4] Yunnan Univ, Sch Informat Sci & Engn, Kunming, Peoples R China
Keywords
Action recognition; Depth maps; Spatial Laplacian pyramid; Temporal energy pyramid; Feature fusion; ENSEMBLE;
DOI
10.1016/j.knosys.2017.01.035
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Depth sequences are useful for action recognition because they are insensitive to illumination variation and provide geometric information. Many current action recognition methods are computationally expensive and require large-scale training data. Here we propose an effective method for human action recognition using depth sequences captured by depth cameras. A multi-resolution operation, the spatial Laplacian and temporal energy pyramid (SLTEP), decomposes the depth sequences into frequency bands at different positions in space and time. A spatial aggregating and fusion scheme then clusters the low-level features and concatenates two different feature types, extracted from the low and high frequency levels, respectively. We evaluate our approach on five public benchmark datasets (MSRAction3D, MSRGesture3D, MSRActionPairs, MSRDailyActivity3D, and NTU RGB+D) and demonstrate its advantages over existing methods; the approach is also likely to be highly useful for online applications. (C) 2017 Elsevier B.V. All rights reserved.
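To make the multi-resolution idea in the abstract concrete, the following is a minimal sketch of a generic spatial Laplacian pyramid applied to one synthetic depth frame. It is not the authors' SLTEP implementation: the function names, the 2x2 block averaging used in place of a Gaussian filter, the nearest-neighbour upsampling, the frame size, and the number of levels are all assumptions chosen for illustration, and the temporal energy pyramid is not covered.

```python
import numpy as np


def downsample(frame):
    """Halve the resolution by averaging non-overlapping 2x2 blocks.

    Assumption: a simple box filter stands in for the Gaussian kernel
    that a classical Laplacian pyramid would normally use.
    """
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(frame):
    """Double the resolution by repeating each pixel (nearest neighbour)."""
    return np.repeat(np.repeat(frame, 2, axis=0), 2, axis=1)


def laplacian_pyramid(frame, levels):
    """Return `levels` band-pass residuals followed by the low-pass remainder.

    The frame height and width must be divisible by 2**levels.
    """
    bands = []
    current = frame.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        # High-frequency detail lost by moving to the coarser level.
        bands.append(current - upsample(smaller))
        current = smaller
    bands.append(current)  # coarsest, low-frequency level
    return bands


def reconstruct(bands):
    """Invert the pyramid by upsampling and adding each residual back."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = upsample(current) + band
    return current


# Hypothetical input: a synthetic 64x64 frame with depth-like values.
rng = np.random.default_rng(0)
depth_frame = rng.uniform(500.0, 4000.0, size=(64, 64))

pyramid = laplacian_pyramid(depth_frame, levels=3)
restored = reconstruct(pyramid)
```

In this sketch the early entries of `pyramid` hold high-frequency detail and the last entry holds the low-frequency remainder, which is the kind of split between low and high frequency levels the abstract refers to when it describes fusing two feature types.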
Pages: 64-74
Page count: 11