A multi-modal system for the retrieval of semantic video events

Cited by: 13
Authors
Amir, A
Basu, S
Iyengar, G
Lin, CY
Naphade, M
Smith, JR
Srinivasan, S
Tseng, B
Affiliations
[1] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
[2] IBM TJ Watson Res Ctr, Hawthorne, NY 10532 USA
[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
multimedia indexing; event detection; semantic video annotation; content-based video retrieval
DOI
10.1016/j.cviu.2004.02.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A framework for event detection is proposed in which events, objects, and other semantic concepts are detected from video using trained classifiers. These classifiers are used to automatically annotate video with semantic labels, which in turn are used to search for new, untrained types of events and semantic concepts. The novelty of the approach lies in (1) the semi-automatic construction of models of events from feature descriptors and (2) the integration of content-based and concept-based querying in the search process. Speech retrieval is applied independently, and combined results are produced. Results of applying these methods to the Search benchmark of the NIST TREC Video track 2001 are reported, and the experience gained and future work are discussed. © 2004 Published by Elsevier Inc.
Pages: 216-236
Page count: 21