Harnessing Lab Knowledge for Real-World Action Recognition

Cited by: 31

Authors:

Ma, Zhigang [1]

Yang, Yi [2]

Nie, Feiping [3]

Sebe, Nicu [4]

Yan, Shuicheng [5]

Hauptmann, Alexander G. [1]

Affiliations:

[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[2] Univ Queensland, ITEE, Brisbane, Qld, Australia

[3] Univ Texas Arlington, Arlington, TX 76019 USA

[4] Univ Trento, Trento, Italy

[5] Natl Univ Singapore, Singapore 117548, Singapore

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2014 / Vol. 109 / Issue 1-2

Funding:

National Research Foundation, Singapore; National Science Foundation (USA); Australian Research Council;

Keywords:

Action recognition; Lab to real-world; Transfer learning; General Schatten-p norm;

DOI:

10.1007/s11263-014-0717-5

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Much research on human action recognition has been oriented toward the performance gain on lab-collected datasets. Yet real-world videos are more diverse, with more complicated actions and often only a few of them are precisely labeled. Thus, recognizing actions from these videos is a tough mission. The paucity of labeled real-world videos motivates us to "borrow" strength from other resources. Specifically, considering that many lab datasets are available, we propose to harness lab datasets to facilitate the action recognition in real-world videos given that the lab and real-world datasets are related. As their action categories are usually inconsistent, we design a multi-task learning framework to jointly optimize the classifiers for both sides. The general Schatten p-norm is exerted on the two classifiers to explore the shared knowledge between them. In this way, our framework is able to mine the shared knowledge between two datasets even if the two have different action categories, which is a major virtue of our method. The shared knowledge is further used to improve the action recognition in the real-world videos. Extensive experiments are performed on real-world datasets with promising results.
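
For reference, the Schatten p-norm named in the keywords is a standard matrix norm built from singular values. The block below states that definition and, purely as an illustrative assumption, a generic joint objective of the kind the abstract describes. The symbols W_l, W_r, \ell_l, \ell_r, and \gamma are hypothetical names chosen here, not notation taken from the paper.

```latex
% Standard definition: Schatten p-norm of W in R^{d x c},
% where \sigma_i(W) denotes the i-th singular value of W
\|W\|_{S_p} = \Bigl( \sum_{i=1}^{\min(d,c)} \sigma_i(W)^{p} \Bigr)^{1/p}

% Illustrative sketch only (assumed form, not the paper's stated objective):
% W_l and W_r are the lab and real-world classifiers over a shared
% feature dimension d; \ell_l and \ell_r are their training losses
\min_{W_l,\, W_r} \; \ell_l(W_l) + \ell_r(W_r)
    + \gamma \, \bigl\| [\, W_l ,\; W_r \,] \bigr\|_{S_p}^{p}
```

With p = 1 the norm reduces to the nuclear (trace) norm, and with p = 2 to the Frobenius norm. Under this assumed form, penalizing the column-wise concatenation of the two classifiers favors a low-rank structure shared by both tasks, and the concatenation does not require the two datasets to have the same number of action categories.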


Pages: 60-73

Page count: 14

