Silhouette Analysis for Human Action Recognition Based on Supervised Temporal t-SNE and Incremental Learning

Cited by: 59
Authors
Cheng, Jian [1, 2]
Liu, Haijun [1]
Wang, Feng [1]
Li, Hongsheng [1]
Zhu, Ce [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Elect Engn, Chengdu 611731, Peoples R China
[2] Univ Elect Sci & Technol China, Ctr Robot, Chengdu 611731, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Human action recognition; manifold learning; stochastic neighbor embedding; incremental learning; NONLINEAR DIMENSIONALITY REDUCTION; MANIFOLDS; EXTENSIONS; EIGENMAPS; ISOMAP; MODELS;
DOI
10.1109/TIP.2015.2441634
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper develops a human action recognition method for human silhouette sequences based on supervised temporal t-stochastic neighbor embedding (ST-tSNE) and incremental learning. Inspired by SNE and its variants, ST-tSNE is proposed to learn the underlying relationship between action frames in a manifold, where class label information and temporal information are introduced to better represent frames from the same action class. For incremental learning, an important step in action recognition, we introduce three methods to perform the low-dimensional embedding of new data. Two of them are motivated by local methods: locally linear embedding and locality preserving projection. These two techniques learn explicit linear representations that follow the local neighbor relationship, and their effectiveness in preserving the intrinsic action structure is investigated. The third is based on manifold-oriented stochastic neighbor projection and finds a linear projection from the high-dimensional to the low-dimensional space that captures the underlying pattern manifold. Extensive experimental results and comparisons with state-of-the-art methods demonstrate the effectiveness and robustness of the proposed ST-tSNE and incremental learning methods in human action silhouette analysis.
Pages: 3203-3217
Page count: 15
Related papers
55 in total
[1] [Anonymous], 2000, Multidimensional scaling
[2] Laplacian eigenmaps for dimensionality reduction and data representation [J]. Belkin, M; Niyogi, P. NEURAL COMPUTATION, 2003, 15 (06): 1373-1396
[3] Bengio Y, 2004, ADV NEUR IN, V16, P177
[4] Blackburn J, 2007, LECT NOTES COMPUT SC, V4814, P285
[5] The recognition of human movement using temporal templates [J]. Bobick, AF; Davis, JW. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2001, 23 (03): 257-267
[6] Charalampous K., 2014, PATTERN ANAL APPL, P1
[7] Flexible background mixture models for foreground segmentation [J]. Cheng, Jian; Yang, Jie; Zhou, Yue; Cui, Yingying. IMAGE AND VISION COMPUTING, 2006, 24 (05): 473-482
[8] Silhouette analysis for human action recognition based on maximum spatio-temporal dissimilarity embedding [J]. Cheng, Jian; Liu, Haijun; Li, Hongsheng. MACHINE VISION AND APPLICATIONS, 2014, 25 (04): 1007-1018
[9] A Monte Carlo experiment to analyze the curse of dimensionality in estimating random coefficients models with a full variance-covariance matrix [J]. Cherchi, Elisabetta; Angelo Guevara, Cristian. TRANSPORTATION RESEARCH PART B-METHODOLOGICAL, 2012, 46 (02): 321-332
[10] Chin-Hsien Fang, 2009, Computer Vision - ACCV 2009. 9th Asian Conference on Computer Vision. Revised Selected Papers, P98