Performance evaluation of early and late fusion methods for generic semantics indexing

Cited by: 33
Authors
Dong, Yuan [1]
Gao, Shan [1]
Tao, Kun [2]
Liu, Jiqing [1]
Wang, Haila [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100088, Peoples R China
[2] France Telecom R&D Beijing Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantic indexing; Concept detection; Multiple kernel learning; Classifier-level fusion; Visual feature extraction; Support vector machines; Classification; Histograms; Kernels
DOI
10.1007/s10044-013-0336-8
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper compares two fusion methods: early fusion and late fusion. Early fusion is carried out at the kernel level, also known as multiple kernel learning; in late fusion, the modalities are fused through logistic regression at the classifier score level. Two kinds of multilayer fusion structures, differing in the number of feature/kernel groups in a lower fusion layer, are constructed for the early and late fusion systems, respectively. The goal of these fusion methods is to put each of the various features into effect and mine redundant information from their combination, and then to develop a generic and robust semantic indexing system that bridges the semantic gap between human concepts and these low-level visual features. Performance evaluated on both the TRECVID2009 and TRECVID2010 datasets demonstrates that the systems with our proposed multilayer fusion methods at the kernel level reach this goal more stably than classification-score-level fusion; the most effective and robust system, with the highest MAP score, is constructed by early fusion with two-layer equally weighted composite kernel learning.
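To make the distinction between the two fusion levels concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The RBF kernel, the toy feature vectors, the function names, and the late-fusion weights are all assumptions chosen for illustration. Early fusion is shown as an equally weighted sum of per-feature Gram matrices, and late fusion as a logistic function applied to a weighted sum of per-classifier scores whose weights are assumed to have been fitted elsewhere.

```python
import math


def rbf_kernel(xs, ys, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||xs[i] - ys[j]||^2)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in ys] for x in xs]


def composite_kernel(kernels, weights=None):
    """Early fusion: weighted sum of per-feature Gram matrices.

    With weights=None, every kernel gets the same weight 1/len(kernels).
    """
    if weights is None:
        weights = [1.0 / len(kernels)] * len(kernels)
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels))
             for j in range(cols)] for i in range(rows)]


def late_fusion(scores, weights, bias):
    """Late fusion: logistic function of a weighted sum of classifier scores.

    weights and bias are assumed to come from a logistic regression
    fitted elsewhere; they are not learned here.
    """
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))


# Toy data (hypothetical): two feature types describing the same three shots.
color = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
texture = [[1.0], [1.0], [0.0]]

# Early fusion: one fused Gram matrix, usable by a single kernel classifier.
K = composite_kernel([rbf_kernel(color, color, 1.0),
                      rbf_kernel(texture, texture, 1.0)])

# Late fusion: one score per feature-specific classifier, fused afterwards.
p = late_fusion([0.0, 0.0], weights=[1.0, 1.0], bias=0.0)
```

The fused matrix `K` could be passed to any classifier that accepts a precomputed kernel, whereas `late_fusion` needs one already trained classifier per feature type. How features are grouped into layers in the multilayer structures is not specified in the abstract and is not modeled here.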
Pages: 37-50
Page count: 14