Audio visual cues for video indexing and retrieval

Cited by: 0

Authors:

Muneesawang, Paisarn [1]

Amin, Tahir [2]

Guan, Ling [2]

Affiliations:

[1] Dept. of Electrical and Computer Engineering, Naresuan University

[2] Dept. of Electrical and Computer Engineering, Ryerson University, Toronto, Ont.

Source:

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2004, Vol. 3331

Keywords:

Image retrieval; Video recording

DOI:

10.1007/978-3-540-30541-5_79

Abstract:

This paper studies content-based video retrieval using a combination of audio and visual features. The visual feature is extracted by an adaptive video indexing technique that places a strong emphasis on accurate characterization of spatio-temporal information within video clips. The audio feature is extracted by a statistical time-frequency analysis method that applies Laplacian mixture models to wavelet coefficients. The proposed joint audio-visual retrieval framework is highly flexible and scalable, and can be effectively applied to various types of video databases. © Springer-Verlag Berlin Heidelberg 2004.

Pages: 642-649

Page count: 7
