Cross-View Action Recognition via Transferable Dictionary Learning

Cited by: 72

Authors:

Zheng, Jingjing [1]

Jiang, Zhuolin [3]

Chellappa, Rama [2]

Affiliations:

[1] Gen Elect Global Res, Niskayuna, NY 12309 USA

[2] Univ Maryland, Inst Adv Comp Studies, Ctr Automat Res, College Pk, MD 20742 USA

[3] Raytheon BBN Technol, Cambridge, MA 02138 USA

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2016 / Vol. 25 / No. 06

Keywords:

Dictionary learning; cross-view; action recognition; transfer learning; DISCRIMINATIVE DICTIONARY; K-SVD;

DOI:

10.1109/TIP.2016.2548242

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Discriminative appearance features are effective for recognizing actions in a fixed view, but may not generalize well to a new view. In this paper, we present two effective approaches to learn dictionaries for robust action recognition across views. In the first approach, we learn a set of view-specific dictionaries where each dictionary corresponds to one camera view. These dictionaries are learned simultaneously from sets of correspondence videos taken at different views, with the aim of encouraging each video in a set to have the same sparse representation. In the second approach, we additionally learn a common dictionary shared by different views to model view-shared features. This approach represents the videos in each view using a view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos of the same action taken from different views to have similar sparse representations. The learned common dictionary not only can represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labeled videos exist in the target view. Extensive experiments on three public datasets demonstrate that the proposed approach outperforms recently developed approaches for cross-view action recognition.

Citation

Pages: 2542-2556

Page count: 15

54 entries in total

[1] K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation [J].