Semantic video indexing using context-dependent fusion

Cited by: 0
Authors
Kim, Dae-Jin [1]
Frigui, Hichem [1]
Fadeev, Aleksey [1]
Affiliations
[1] Univ Louisville, CECS Dept, Louisville, KY 40292 USA
Source
MULTIMEDIA CONTENT ACCESS: ALGORITHMS AND SYSTEMS II | 2008 / Vol. 6820
Keywords
video summary; semantic indexing; algorithm fusion
DOI
10.1117/12.766542
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel method for fusing the results of multiple semantic video indexing algorithms that use different types of feature descriptors and different classification methods. This method, called Context-Dependent Fusion (CDF), is motivated by the fact that the relative performance of different semantic indexing methods can vary significantly depending on the video type, the context information, and the high-level concept of the video segment to be labeled. The training part of CDF has two main components: context extraction and algorithm fusion. In context extraction, the low-level audio-visual descriptors used by the different classification algorithms are combined and used to partition the descriptor space into groups of similar video shots, or contexts. The algorithm fusion component identifies a subset of classification algorithms (local experts) for each context based on their relative performance within that context. Results on the TRECVID-2002 data collections show that the proposed method can identify meaningful and coherent clusters and that different labeling algorithms can be identified for the different contexts. Our initial experiments indicate that context-dependent fusion outperforms the individual algorithms. We also show that, using simple visual descriptors and a simple K-NN classifier, the CDF approach provides results that are comparable to state-of-the-art methods in semantic indexing.
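The two training components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the clustering method, so plain k-means with a deterministic farthest-point initialization stands in for context extraction. The sketch also keeps a single best algorithm per context rather than a subset of local experts. All function names (`kmeans`, `assign`, `fit_cdf`, `predict_cdf`) and the synthetic data are hypothetical.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center (context) for every descriptor vector."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, iters=20):
    """Plain k-means; an assumed stand-in for the context-extraction step."""
    # Deterministic farthest-point initialization (assumption, for repeatability).
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def fit_cdf(X, y, predictions, k):
    """Pick one local expert per context.

    predictions has shape (n_algorithms, n_shots): the label each base
    algorithm gave to each training shot.
    """
    centers = kmeans(X, k)
    ctx = assign(X, centers)
    correct = predictions == y[None, :]
    experts = np.zeros(k, dtype=int)
    for j in range(k):
        mask = ctx == j
        if mask.any():
            # Best within-context accuracy wins (simplification: one expert only).
            experts[j] = correct[:, mask].mean(axis=1).argmax()
    return centers, experts

def predict_cdf(X, predictions, centers, experts):
    """Label each shot with the output of its context's expert."""
    ctx = assign(X, centers)
    return predictions[experts[ctx], np.arange(len(X))]

# Synthetic demo: two contexts, each algorithm reliable in only one of them.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(10, 0.1, (20, 2))])
y = rng.integers(0, 2, 40)
in_a = np.arange(40) < 20
preds = np.vstack([np.where(in_a, y, 1 - y),   # algorithm 0: right in context A only
                   np.where(in_a, 1 - y, y)])  # algorithm 1: right in context B only
centers, experts = fit_cdf(X, y, preds, k=2)
fused = predict_cdf(X, preds, centers, experts)
fused_acc = float((fused == y).mean())
```

In this toy setup each base algorithm is correct on only half of the shots, while the fused labeling follows whichever algorithm is reliable in the shot's context. Real descriptors and classifiers would not separate this cleanly.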
Pages: 11