Hierarchical and Spatio-Temporal Sparse Representation for Human Action Recognition

Cited by: 15
Authors
Tian, Yi [1]
Kong, Yu [2]
Ruan, Qiuqi [1]
An, Gaoyun [1]
Fu, Yun [2]
Affiliations
[1] Beijing Jiaotong University, Institute of Information Science, Beijing 100044, People's Republic of China
[2] Northeastern University, Department of Electrical and Computer Engineering, Boston, MA 02115, USA
Funding
National Natural Science Foundation of China
Keywords
Action Recognition; locally consistent group sparse coding; hierarchical sparse coding scheme; absolute and relative location models; IMAGE CLASSIFICATION; MOTION; FEATURES; VECTOR; ROBUST;
DOI
10.1109/TIP.2017.2788196
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a novel two-layer video representation for human action recognition that employs a hierarchical group sparse encoding technique and spatio-temporal structure. In the first layer, a new sparse encoding method named locally consistent group sparse coding (LCGSC) is proposed to make full use of the motion and appearance information of local features. The LCGSC method not only encodes the global layout of features within the same video-level group but also captures the local correlations between them, yielding expressive sparse representations of video sequences. Meanwhile, two efficient location estimation models, an absolute location model and a relative location model, are developed to incorporate spatio-temporal structure into the LCGSC representations. In the second layer, an action-level group is established, and a hierarchical LCGSC encoding scheme is applied to describe videos at different levels of abstraction. On the one hand, the new layer captures higher-order dependencies between video sequences; on the other hand, it takes label information into account to improve the discriminative power of the video representations. The advantages of our hierarchical framework are demonstrated on several challenging datasets.
Pages: 1748-1762
Number of pages: 15