Audio visual cues for video indexing and retrieval

Cited by: 0
Authors
Muneesawang, Paisarn [1]
Amin, Tahir [2]
Guan, Ling [2]
Affiliations
[1] Dept. of Electrical and Computer Engineering, Naresuan University
[2] Dept. of Electrical and Computer Engineering, Ryerson University, Toronto, Ont.
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2004 / Vol. 3331
Keywords
Image retrieval; Video recording
DOI
10.1007/978-3-540-30541-5_79
Abstract
This paper studies content-based video retrieval using a combination of audio and visual features. The visual feature is extracted by an adaptive video indexing technique that places strong emphasis on accurate characterization of the spatio-temporal information within video clips. The audio feature is extracted by a statistical time-frequency analysis method that applies Laplacian mixture models to wavelet coefficients. The proposed joint audio-visual retrieval framework is highly flexible and scalable, and can be applied effectively to various types of video databases. © Springer-Verlag Berlin Heidelberg 2004.
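To make the audio-feature idea in the abstract concrete, here is a minimal sketch of fitting a Laplacian mixture to wavelet coefficients. It is illustrative only and is not the authors' implementation. The abstract does not state the wavelet, the number of mixture components, or the fitting procedure, so the one-level Haar transform, the two zero-mean components, the EM updates, and the function names `haar_detail` and `fit_laplacian_mixture` are all assumptions.

```python
import numpy as np

def haar_detail(signal):
    """One-level Haar detail coefficients (assumed wavelet, for illustration)."""
    x = np.asarray(signal, dtype=float)
    x = x[: len(x) - len(x) % 2]  # drop a trailing sample if the length is odd
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

def fit_laplacian_mixture(coeffs, n_iter=200, eps=1e-12):
    """EM for a two-component, zero-mean Laplacian mixture (assumed model form).

    Returns (weights, scales), each an array of length 2.
    """
    a = np.abs(np.asarray(coeffs, dtype=float))
    # Heuristic start: one narrow and one wide component around the mean magnitude.
    scales = np.array([0.5, 2.0]) * (a.mean() + eps)
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each coefficient.
        dens = weights / (2.0 * scales) * np.exp(-a[:, None] / scales)
        resp = dens / (dens.sum(axis=1, keepdims=True) + eps)
        # M-step: mixing weights, then the weighted mean absolute value,
        # which is the maximum-likelihood scale of a zero-mean Laplacian.
        nk = resp.sum(axis=0)
        weights = nk / len(a)
        scales = (resp * a[:, None]).sum(axis=0) / (nk + eps) + eps
    return weights, scales

# Usage on synthetic data (not real audio): the fitted parameters are
# concatenated into a small feature vector for one clip.
rng = np.random.default_rng(0)
audio = rng.standard_normal(4096)
w, b = fit_laplacian_mixture(haar_detail(audio))
feature = np.concatenate([w, b])
```

Under these assumptions the fitted weights and scales act as a compact per-clip descriptor; a multi-level decomposition would repeat the fit for each subband.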
Pages: 642-649
Number of pages: 7