Efficient Search and Localization of Human Actions in Video Databases

Cited by: 58

Authors:

Shao, Ling [1,2]

Jones, Simon [2]

Li, Xuelong [3]

Affiliations:

[1] Nanjing Univ Informat Sci & Technol, Coll Elect & Informat Engn, Nanjing 210044, Jiangsu, Peoples R China

[2] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England

[3] Chinese Acad Sci, Xian Inst Opt & Precis Mech, State Key Lab Transient Opt & Photon, Ctr Opt Imagery Anal & Learning OPTIMAL, Xian 710119, Shaanxi, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2014 / Vol. 24 / No. 03

Funding:

National Natural Science Foundation of China; UK Engineering and Physical Sciences Research Council;

Keywords:

Human actions; relevance feedback; spatio-temporal localization; video retrieval; RELEVANCE FEEDBACK; RECOGNITION; RETRIEVAL;

DOI:

10.1109/TCSVT.2013.2276700

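Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract: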

As digital video databases grow, so grows the problem of effectively navigating through them. In this paper we propose a novel content-based video retrieval approach to searching such video databases, specifically those involving human actions, incorporating spatio-temporal localization. We outline a novel, highly efficient localization model that first performs temporal localization based on histograms of evenly spaced time-slices, then spatial localization based on histograms of a 2-D spatial grid. We further argue that our retrieval model, based on the aforementioned localization, followed by relevance ranking, results in a highly discriminative system, while remaining an order of magnitude faster than the current state-of-the-art method. We also show how relevance feedback can be applied to our localization and ranking algorithms. As a result, the presented system is more directly applicable to real-world problems than any prior content-based video retrieval system.

Pages: 504-512

Page count: 9
