Efficient Search and Localization of Human Actions in Video Databases

Cited by: 58
Authors
Shao, Ling [1, 2]
Jones, Simon [2 ]
Li, Xuelong [3 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Coll Elect & Informat Engn, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England
[3] Chinese Acad Sci, Xian Inst Opt & Precis Mech, State Key Lab Transient Opt & Photon, Ctr Opt Imagery Anal & Learning OPTIMAL, Xian 710119, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK)
Keywords
Human actions; relevance feedback; spatio-temporal localization; video retrieval; recognition; retrieval
DOI
10.1109/TCSVT.2013.2276700
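Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract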
As digital video databases grow, so grows the problem of effectively navigating through them. In this paper we propose a novel content-based video retrieval approach to searching such video databases, specifically those involving human actions, incorporating spatio-temporal localization. We outline a novel, highly efficient localization model that first performs temporal localization based on histograms of evenly spaced time-slices, then spatial localization based on histograms of a 2-D spatial grid. We further argue that our retrieval model, based on the aforementioned localization, followed by relevance ranking, results in a highly discriminative system, while remaining an order of magnitude faster than the current state-of-the-art method. We also show how relevance feedback can be applied to our localization and ranking algorithms. As a result, the presented system is more directly applicable to real-world problems than any prior content-based video retrieval system.
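The temporal stage described in the abstract can be illustrated with a minimal sketch. It assumes that local features have already been quantized into visual-word indices with timestamps, that slices are compared to the query by normalized histogram intersection, and that every contiguous window of slices is scored exhaustively. The function names, the scoring measure, and the exhaustive search are illustrative assumptions, not the authors' published method; the spatial stage would apply the same idea to histograms over a 2-D grid.

```python
import numpy as np

def slice_histograms(words, times, n_slices, vocab_size, duration):
    """Build one bag-of-words histogram per evenly spaced time slice.

    words: visual-word index of each feature (hypothetical input)
    times: timestamp of each feature, in the range [0, duration]
    """
    words = np.asarray(words, dtype=int)
    times = np.asarray(times, dtype=float)
    hists = np.zeros((n_slices, vocab_size))
    # Map each timestamp to its slice; clamp t == duration into the last slice.
    idx = np.minimum((times / duration * n_slices).astype(int), n_slices - 1)
    np.add.at(hists, (idx, words), 1)
    return hists

def localize_temporally(hists, query_hist):
    """Return (start, end, score) of the best contiguous window of slices.

    The window covers slices start..end-1. Scoring by histogram
    intersection is an assumption made for this sketch.
    """
    query_hist = np.asarray(query_hist, dtype=float)
    q = query_hist / max(query_hist.sum(), 1.0)
    n = len(hists)
    # Prefix sums let any window histogram be formed with one subtraction.
    prefix = np.vstack([np.zeros(hists.shape[1]), np.cumsum(hists, axis=0)])
    best = (0, 0, -1.0)
    for start in range(n):
        for end in range(start + 1, n + 1):
            window = prefix[end] - prefix[start]
            total = window.sum()
            if total == 0:
                continue  # empty window, nothing to compare
            score = float(np.minimum(window / total, q).sum())
            if score > best[2]:
                best = (start, end, score)
    return best
```

Because the slice histograms are computed once per video and each window is obtained from prefix sums, the per-query cost depends on the number of slices rather than on the number of raw features, which is one plausible reading of why slice-based localization can be fast.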
Pages: 504-512
Page count: 9