Cross-View Action Recognition via Transferable Dictionary Learning

Cited by: 72
Authors
Zheng, Jingjing [1]
Jiang, Zhuolin [3]
Chellappa, Rama [2]
Affiliations
[1] Gen Elect Global Res, Niskayuna, NY 12309 USA
[2] Univ Maryland, Inst Adv Comp Studies, Ctr Automat Res, College Pk, MD 20742 USA
[3] Raytheon BBN Technol, Cambridge, MA 02138 USA
Keywords
Dictionary learning; cross-view; action recognition; transfer learning; discriminative dictionary; K-SVD
DOI
10.1109/TIP.2016.2548242
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Discriminative appearance features are effective for recognizing actions in a fixed view, but may not generalize well to a new view. In this paper, we present two effective approaches to learn dictionaries for robust action recognition across views. In the first approach, we learn a set of view-specific dictionaries, where each dictionary corresponds to one camera view. These dictionaries are learned simultaneously from sets of correspondence videos taken from different views, with the aim of encouraging each video in a set to have the same sparse representation. In the second approach, we additionally learn a common dictionary shared by different views to model view-shared features. This approach represents the videos in each view using a view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos of the same action taken from different views to have similar sparse representations. The learned common dictionary not only can represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labeled videos exist in the target view. Extensive experiments on three public datasets demonstrate that the proposed approach outperforms recently developed approaches for cross-view action recognition.
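The first approach described in the abstract can be illustrated with a minimal sketch, assuming that the feature vectors of each pair of correspondence videos are stacked so that a single sparse code must reconstruct both views. This is not the authors' implementation: it replaces K-SVD with a simpler least-squares (method of optimal directions) dictionary update, and every name here (`omp`, `learn_view_dictionaries`, `Y_src`, `Y_tgt`, `n_atoms`, `sparsity`) is hypothetical.

```python
import numpy as np


def omp(D, y, k):
    """Greedy orthogonal matching pursuit: code y with at most k atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    x = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x


def sparse_codes(D, Y, k):
    """Code every column of Y against dictionary D."""
    return np.column_stack([omp(D, Y[:, i], k) for i in range(Y.shape[1])])


def learn_view_dictionaries(Y_src, Y_tgt, n_atoms, sparsity, n_iter=10, seed=0):
    """Learn one dictionary per view with a shared sparse code per video pair.

    Column i of Y_src and column i of Y_tgt are assumed to describe the same
    action instance seen from two views (a correspondence pair).
    """
    rng = np.random.default_rng(seed)
    d_src = Y_src.shape[0]
    Y = np.vstack([Y_src, Y_tgt])  # stacking forces one code for both views
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_iter):
        X = sparse_codes(D, Y, sparsity)
        # Least-squares dictionary update (a stand-in for K-SVD, by assumption).
        D = Y @ np.linalg.pinv(X)
        norms = np.linalg.norm(D, axis=0, keepdims=True)
        D /= np.maximum(norms, 1e-12)  # unused atoms stay at zero
    X = sparse_codes(D, Y, sparsity)
    # Split the stacked dictionary back into its two view-specific parts.
    return D[:d_src], D[d_src:], X
```

Under these assumptions, a video from one view would be coded against that view's dictionary alone, and the resulting sparse code would serve as a feature intended to be comparable across views. Note that the split dictionaries are normalized jointly rather than per view, which a fuller implementation might handle differently.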
Pages: 2542-2556
Number of pages: 15