Efficient Search and Localization of Human Actions in Video Databases

Cited by: 58
Authors
Shao, Ling [1, 2]
Jones, Simon [2]
Li, Xuelong [3]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Coll Elect & Informat Engn, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England
[3] Chinese Acad Sci, Xian Inst Opt & Precis Mech, State Key Lab Transient Opt & Photon, Ctr Opt Imagery Anal & Learning OPTIMAL, Xian 710119, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK)
Keywords
Human actions; relevance feedback; spatio-temporal localization; video retrieval; RELEVANCE FEEDBACK; RECOGNITION; RETRIEVAL;
DOI
10.1109/TCSVT.2013.2276700
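Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract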
As digital video databases grow, so grows the problem of effectively navigating through them. In this paper we propose a novel content-based video retrieval approach to searching such video databases, specifically those involving human actions, incorporating spatio-temporal localization. We outline a novel, highly efficient localization model that first performs temporal localization based on histograms of evenly spaced time-slices, then spatial localization based on histograms of a 2-D spatial grid. We further argue that our retrieval model, based on the aforementioned localization, followed by relevance ranking, results in a highly discriminative system, while remaining an order of magnitude faster than the current state-of-the-art method. We also show how relevance feedback can be applied to our localization and ranking algorithms. As a result, the presented system is more directly applicable to real-world problems than any prior content-based video retrieval system.
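The two-stage localization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the feature representation, the similarity measure, or the search strategy, so everything below is an assumption made for illustration: features are quantized "visual words" with a time stamp and a position, similarity is histogram intersection, and both stages use an exhaustive window search with ties broken in favor of the smaller window. The names `bin_index`, `similarity`, and `localize` are hypothetical and are not taken from the paper.

```python
import numpy as np

def bin_index(coords, extent, n_bins):
    """Map coordinates in [0, extent] to evenly spaced bin indices."""
    idx = (np.asarray(coords, dtype=float) / extent * n_bins).astype(int)
    return np.minimum(idx, n_bins - 1)

def similarity(hist, query):
    """Histogram intersection of L1-normalized histograms (assumed measure)."""
    hs, qs = hist.sum(), query.sum()
    if hs == 0 or qs == 0:
        return 0.0
    return float(np.minimum(hist / hs, query / qs).sum())

def localize(t, x, y, words, query, duration, width, height, n_slices=8, grid=4):
    """Temporal localization over time-slices, then spatial over a 2-D grid."""
    query = np.asarray(query, dtype=float)
    words = np.asarray(words, dtype=int)
    n_words = len(query)

    # Stage 1: one visual-word histogram per evenly spaced time-slice.
    ti = bin_index(t, duration, n_slices)
    slice_hists = np.zeros((n_slices, n_words))
    np.add.at(slice_hists, (ti, words), 1)

    # Pick the contiguous run of slices that best matches the query.
    best_key, t0, t1 = (-1.0, 0), 0, 0
    for a in range(n_slices):
        for b in range(a + 1, n_slices + 1):
            key = (similarity(slice_hists[a:b].sum(axis=0), query), -(b - a))
            if key > best_key:
                best_key, t0, t1 = key, a, b

    # Stage 2: grid-cell histograms built only from features in the chosen run.
    keep = (ti >= t0) & (ti < t1)
    xi = bin_index(np.asarray(x)[keep], width, grid)
    yi = bin_index(np.asarray(y)[keep], height, grid)
    cell_hists = np.zeros((grid, grid, n_words))
    np.add.at(cell_hists, (yi, xi, words[keep]), 1)

    # Pick the rectangle of grid cells that best matches the query.
    best_key, box = (-1.0, 0), (0, 0, 0, 0)
    for y0 in range(grid):
        for y1 in range(y0 + 1, grid + 1):
            for x0 in range(grid):
                for x1 in range(x0 + 1, grid + 1):
                    h = cell_hists[y0:y1, x0:x1].sum(axis=(0, 1))
                    key = (similarity(h, query), -(y1 - y0) * (x1 - x0))
                    if key > best_key:
                        best_key, box = key, (x0, y0, x1, y1)

    dt, dx, dy = duration / n_slices, width / grid, height / grid
    x0, y0, x1, y1 = box
    return {
        "score": best_key[0],
        "time": (t0 * dt, t1 * dt),
        "box": (x0 * dx, y0 * dy, x1 * dx, y1 * dy),
    }
```

Because both stages work on a small number of coarse histograms rather than on raw features, the exhaustive search stays cheap for modest values of `n_slices` and `grid`; this only illustrates the general idea and makes no claim about the efficiency figures reported by the authors.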
Pages: 504-512
Page count: 9