Action Recognition From Arbitrary Views Using Transferable Dictionary Learning

Citations: 46
Authors
Zhang, Jingtian [1]
Shum, Hubert P. H. [1]
Han, Jungong [2]
Shao, Ling [3]
Affiliations
[1] Northumbria Univ, Dept Comp & Informat Sci, Newcastle Upon Tyne NE1 8ST, Tyne & Wear, England
[2] Univ Lancaster, Sch Comp & Commun, Lancaster LA1 4YW, England
[3] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Action recognition; 3D dense trajectories; view-invariance; transfer dictionary learning; RECOGNIZING ACTIONS; HISTOGRAMS; DENSE;
DOI
10.1109/TIP.2018.2836323
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human action recognition is crucial to many practical applications, ranging from human-computer interaction to video surveillance. Most approaches either recognize the human action from a fixed view or require knowledge of the view angle, which is usually not available in practical applications. In this paper, we propose a novel end-to-end framework to jointly learn a view-invariant transfer dictionary and a view-invariant classifier. The result of the process is a dictionary that can project real-world 2D video into a view-invariant sparse representation, and a classifier that recognizes actions from an arbitrary view. The main feature of our algorithm is the use of synthetic data to extract view-invariance between 3D and 2D videos during the pre-training phase. This guarantees the availability of training data and removes the hassle of obtaining real-world videos at specific viewing angles. Additionally, to better describe the actions in 3D videos, we introduce a new feature set called the 3D dense trajectories to effectively encode extracted trajectory information on 3D videos. Experimental results on the IXMAS, N-UCLA, i3DPost and UWA3DII data sets show improvements over existing algorithms.
Pages: 4709-4723
Number of pages: 15