Audio visual cues for video indexing and retrieval

Cited by: 0
Authors
Muneesawang, Paisarn [1]
Amin, Tahir [2]
Guan, Ling [2]
Affiliations
[1] Dept. of Electrical and Computer Engineering, Naresuan University
[2] Dept. of Electrical and Computer Engineering, Ryerson University, Toronto, Ont.
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2004 / Vol. 3331
Keywords
Image retrieval; Video recording
DOI
10.1007/978-3-540-30541-5_79
Abstract
This paper studies content-based video retrieval using a combination of audio and visual features. The visual feature is extracted by an adaptive video indexing technique that places a strong emphasis on accurate characterization of spatio-temporal information within video clips. The audio feature is extracted by a statistical time-frequency analysis method that applies Laplacian mixture models to wavelet coefficients. The proposed joint audio-visual retrieval framework is highly flexible and scalable, and can be effectively applied to various types of video databases. © Springer-Verlag Berlin Heidelberg 2004.
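To illustrate the audio side of the abstract, the following is a minimal sketch of fitting a Laplacian mixture model to wavelet coefficients. It is not the authors' implementation: the one-level Haar transform, the two zero-mean components, the EM initialization, the synthetic signal, and the function names `haar_detail` and `fit_laplacian_mixture` are all assumptions made for illustration.

```python
import math
import random


def haar_detail(signal):
    """One-level Haar detail coefficients of an even-length signal."""
    s = math.sqrt(2.0)
    return [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]


def laplace_pdf(x, b):
    """Density of a zero-mean Laplacian with scale b."""
    return math.exp(-abs(x) / b) / (2.0 * b)


def fit_laplacian_mixture(coeffs, iters=50, eps=1e-12):
    """EM for a two-component, zero-mean Laplacian mixture (illustrative only).

    Returns (weights, scales), each a list of two floats.
    """
    n = len(coeffs)
    mean_abs = max(sum(abs(c) for c in coeffs) / n, eps)
    weights = [0.5, 0.5]
    # Assumed initialization: one narrow and one wide component.
    scales = [0.5 * mean_abs, 2.0 * mean_abs]
    for _ in range(iters):
        # E-step: posterior probability of each component for each coefficient.
        resp = []
        for c in coeffs:
            p = [w * laplace_pdf(c, b) for w, b in zip(weights, scales)]
            total = max(sum(p), eps)
            resp.append([pk / total for pk in p])
        # M-step: the weighted mean of |x| is the maximum-likelihood scale
        # of a zero-mean Laplacian.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), eps)
            weights[k] = nk / n
            scales[k] = max(sum(r[k] * abs(c) for r, c in zip(resp, coeffs)) / nk, eps)
    return weights, scales


# Synthetic stand-in for an audio frame: low-level noise plus one loud burst.
random.seed(0)
signal = [random.gauss(0.0, 0.1) for _ in range(512)]
for i in range(100, 140):
    signal[i] += random.gauss(0.0, 2.0)

weights, scales = fit_laplacian_mixture(haar_detail(signal))
print("weights:", weights)
print("scales:", scales)
```

One plausible use, again an assumption rather than something the abstract states, is to collect the fitted weights and scales from several wavelet subbands into a fixed-length audio feature vector for each clip.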
Pages: 642-649
Number of pages: 7