We present a novel method for fusing the results of multiple semantic video indexing algorithms that use different types of feature descriptors and different classification methods. This method, called Context-Dependent Fusion (CDF), is motivated by the observation that the relative performance of different semantic indexing methods can vary significantly with the video type, the context information, and the high-level concept of the video segment to be labeled. The training part of CDF has two main components: context extraction and algorithm fusion. In context extraction, the low-level audio-visual descriptors used by the different classification algorithms are combined and used to partition the descriptor space into groups of similar video shots, or contexts. The algorithm fusion component then identifies a subset of classification algorithms (local experts) for each context based on their relative performance within that context. Results on the TRECVID-2002 data collection show that the proposed method can identify meaningful and coherent clusters, and that different labeling algorithms emerge as local experts for different contexts. Our initial experiments indicate that context-dependent fusion outperforms the individual algorithms. We also show that, using simple visual descriptors and a simple K-NN classifier, the CDF approach yields results comparable to state-of-the-art methods in semantic indexing.
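The abstract does not specify how the two training components are implemented. The Python sketch below illustrates one plausible instantiation under stated assumptions: context extraction via K-means over the combined descriptor space, local-expert selection by per-context accuracy, and fusion of the experts' outputs by majority vote at labeling time. The function names and parameters (train_cdf, predict_cdf, n_contexts, n_experts) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_cdf(descriptors, labels, algorithm_preds, n_contexts=8, n_experts=2):
    """Hypothetical CDF training sketch (not the paper's exact procedure).

    descriptors:     (n_shots, d) combined low-level audio-visual features
    labels:          (n_shots,) ground-truth concept labels
    algorithm_preds: (n_algorithms, n_shots) labels predicted by each
                     individual indexing algorithm on the training shots
    """
    # Context extraction: partition the descriptor space into groups
    # of similar video shots ("contexts"); K-means is one simple choice.
    clusterer = KMeans(n_clusters=n_contexts, n_init=10, random_state=0)
    contexts = clusterer.fit_predict(descriptors)

    # Algorithm fusion: within each context, rank the algorithms by
    # their accuracy on the shots assigned to that context and keep
    # the top n_experts as the context's local experts.
    local_experts = {}
    for c in range(n_contexts):
        mask = contexts == c
        accuracies = (algorithm_preds[:, mask] == labels[mask]).mean(axis=1)
        local_experts[c] = np.argsort(accuracies)[::-1][:n_experts]
    return clusterer, local_experts

def predict_cdf(clusterer, local_experts, descriptor, algorithm_preds):
    """Assign a new shot to its nearest context, then fuse the
    predictions of that context's local experts by majority vote."""
    c = clusterer.predict(descriptor.reshape(1, -1))[0]
    expert_preds = algorithm_preds[local_experts[c]]
    values, counts = np.unique(expert_preds, return_counts=True)
    return values[np.argmax(counts)]
```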