Audio visual cues for video indexing and retrieval

Cited by: 0

Authors:

Muneesawang, Paisarn [1]

Amin, Tahir [2]

Guan, Ling [2]

Affiliations:

[1] Dept. of Electrical and Computer Engineering, Naresuan University

[2] Dept. of Electrical and Computer Engineering, Ryerson University, Toronto, Ont.

Source:

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2004, Vol. 3331

Keywords:

Image retrieval; Video recording

DOI:

10.1007/978-3-540-30541-5_79

Abstract:

This paper studies content-based video retrieval using a combination of audio and visual features. The visual feature is extracted by an adaptive video indexing technique that places a strong emphasis on accurate characterization of spatio-temporal information within video clips. The audio feature is extracted by a statistical time-frequency analysis method that applies Laplacian mixture models to wavelet coefficients. The proposed joint audio-visual retrieval framework is highly flexible and scalable, and can be effectively applied to various types of video databases. © Springer-Verlag Berlin Heidelberg 2004.

Pages: 642-649

Page count: 7
