Efficient Search and Localization of Human Actions in Video Databases

Cited by: 58

Authors:

Shao, Ling [1,2]

Jones, Simon [2]

Li, Xuelong [3]

Affiliations:

[1] Nanjing Univ Informat Sci & Technol, Coll Elect & Informat Engn, Nanjing 210044, Jiangsu, Peoples R China

[2] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England

[3] Chinese Acad Sci, Xian Inst Opt & Precis Mech, State Key Lab Transient Opt & Photon, Ctr Opt Imagery Anal & Learning OPTIMAL, Xian 710119, Shaanxi, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2014, Vol. 24, No. 3

Funding:

National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK);

Keywords:

Human actions; relevance feedback; spatio-temporal localization; video retrieval; RELEVANCE FEEDBACK; RECOGNITION; RETRIEVAL;

DOI:

10.1109/TCSVT.2013.2276700

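Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes: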

0808 ; 0809 ;

Abstract:

As digital video databases grow, so grows the problem of effectively navigating through them. In this paper we propose a novel content-based video retrieval approach to searching such video databases, specifically those involving human actions, incorporating spatio-temporal localization. We outline a novel, highly efficient localization model that first performs temporal localization based on histograms of evenly spaced time-slices, then spatial localization based on histograms of a 2-D spatial grid. We further argue that our retrieval model, based on the aforementioned localization, followed by relevance ranking, results in a highly discriminative system, while remaining an order of magnitude faster than the current state-of-the-art method. We also show how relevance feedback can be applied to our localization and ranking algorithms. As a result, the presented system is more directly applicable to real-world problems than any prior content-based video retrieval system.
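The two-stage localization described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the record gives no algorithmic detail, so the bag-of-visual-words features, the histogram-intersection similarity, the exhaustive window search, and every function and parameter name below are assumptions made purely for illustration.

```python
import numpy as np

def slice_histograms(times, words, n_slices, n_words, duration):
    """Visual-word histogram for each evenly spaced time-slice (assumed feature model)."""
    idx = np.minimum((times / duration * n_slices).astype(int), n_slices - 1)
    hists = np.zeros((n_slices, n_words))
    np.add.at(hists, (idx, words), 1)  # unbuffered add, so repeated bins accumulate
    return hists

def grid_histograms(xs, ys, words, grid, n_words, width, height):
    """Visual-word histogram for each cell of a grid x grid spatial layout."""
    gx = np.minimum((xs / width * grid).astype(int), grid - 1)
    gy = np.minimum((ys / height * grid).astype(int), grid - 1)
    hists = np.zeros((grid, grid, n_words))
    np.add.at(hists, (gy, gx, words), 1)
    return hists

def similarity(h, q):
    """Histogram intersection of L1-normalised histograms (an assumed measure)."""
    hs, qs = h.sum(), q.sum()
    if hs == 0 or qs == 0:
        return 0.0
    return float(np.minimum(h / hs, q / qs).sum())

def localize_temporal(hists, query):
    """Best contiguous run of slices as (start, end, score); end is exclusive."""
    # Cumulative sums let each candidate window's histogram be formed by one subtraction.
    cum = np.vstack([np.zeros(hists.shape[1]), np.cumsum(hists, axis=0)])
    best = (0, 0, -1.0)
    n = len(hists)
    for s in range(n):
        for e in range(s + 1, n + 1):
            score = similarity(cum[e] - cum[s], query)
            if score > best[2]:
                best = (s, e, score)
    return best

def localize_spatial(cells, query):
    """Best rectangle of grid cells as (y0, y1, x0, x1, score); upper bounds exclusive."""
    g = cells.shape[0]
    best = (0, 0, 0, 0, -1.0)
    for y0 in range(g):
        for y1 in range(y0 + 1, g + 1):
            for x0 in range(g):
                for x1 in range(x0 + 1, g + 1):
                    h = cells[y0:y1, x0:x1].sum(axis=(0, 1))
                    score = similarity(h, query)
                    if score > best[4]:
                        best = (y0, y1, x0, x1, score)
    return best
```

In this sketch, `localize_temporal` would be run first on the slice histograms of a database video; `grid_histograms` and `localize_spatial` would then be applied only to the features that fall inside the selected time window, mirroring the temporal-then-spatial order stated in the abstract. Because the histograms are precomputed per slice and per cell, each candidate window is scored from a handful of small arrays rather than from the raw features.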

Citation:

Pages: 504-512

Page count: 9
