SLED: Semantic Label Embedding Dictionary Representation for Multilabel Image Annotation

Cited by: 38
Authors
Cao, Xiaochun [1, 2]
Zhang, Hua [1, 2]
Guo, Xiaojie [1]
Liu, Si [1]
Meng, Dan [1]
Affiliations
[1] Chinese Acad Sci, State Key Lab Informat Secur, Inst Informat Engn, Beijing 100093, Peoples R China
[2] Tianjin Univ, Sch Comp Sci & Technol, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DISCRIMINATIVE DICTIONARY; TAG COMPLETION; K-SVD; SPARSE; RECOGNITION; RELEVANCE; FEATURES;
DOI
10.1109/TIP.2015.2428055
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Most existing methods for weakly supervised image annotation rely on joint unsupervised feature representations, the components of which are not directly correlated with specific labels. In practice, however, there is a large gap between the training and testing data: the label combinations in the testing data are not always consistent with those in the training data. To bridge this gap, this paper presents a semantic label embedding dictionary representation that not only achieves a discriminative feature representation for each label in the image, but also mines the semantic relevance between co-occurring labels for context information. More specifically, to enhance the discriminative representation of labels, the training data are first divided into a set of overlapping groups by graph shift on the exclusive label graph. Then, given a group of exclusive labels, we learn multiple label-specific dictionaries to explicitly decorrelate the feature representation of each label, and propose a joint optimization approach based on the Fisher discrimination criterion to solve this problem. Next, to discover the context information hidden in the co-occurring labels, we explore the semantic relationship between the visual words in the dictionaries and the labels in a multitask learning manner with respect to the reconstruction coefficients of the training data. In the annotation stage, the reconstruction coefficients of a test image are obtained from the discriminative dictionaries and exclusive label groups together with a group sparsity constraint. Finally, we introduce a label propagation scheme that computes the score of each label for the test image from its reconstruction coefficients. Experimental results on three challenging data sets demonstrate that the proposed method leads to significant performance gains over existing methods.
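The annotation stage described in the abstract (coding a test image over label-specific dictionaries under a group sparsity constraint, then scoring labels from the coefficients) can be illustrated with a generic group-lasso sketch. This is not the paper's algorithm: the solver (proximal gradient with block soft-thresholding), the function names, the random toy dictionaries, and the normalized per-label coefficient energy used in place of the label propagation scheme are all assumptions made for illustration.

```python
import numpy as np

def group_soft_threshold(a, groups, tau):
    """Proximal operator of tau * sum_g ||a_g||_2 (block soft-thresholding)."""
    out = np.zeros_like(a)
    for idx in groups:
        norm = np.linalg.norm(a[idx])
        if norm > tau:
            out[idx] = (1.0 - tau / norm) * a[idx]
    return out

def group_sparse_code(x, D, groups, lam=0.1, n_iter=200):
    """Approximately minimize 0.5*||x - D a||^2 + lam * sum_g ||a_g||_2."""
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = group_soft_threshold(a - step * grad, groups, step * lam)
    return a

def label_scores(a, groups):
    """Hypothetical scoring: share of coefficient energy in each label's group."""
    energy = np.array([np.linalg.norm(a[idx]) for idx in groups])
    total = energy.sum()
    return energy / total if total > 0 else energy

# Toy setup (assumed): 3 labels, 4 random unit-norm atoms per label.
rng = np.random.default_rng(0)
n_labels, atoms_per_label, dim = 3, 4, 20
D = rng.standard_normal((dim, n_labels * atoms_per_label))
D /= np.linalg.norm(D, axis=0)
groups = [np.arange(k * atoms_per_label, (k + 1) * atoms_per_label)
          for k in range(n_labels)]

# A test feature synthesized only from the atoms of label 1.
x = D[:, groups[1]] @ np.array([1.0, -0.5, 0.8, 0.3])
a = group_sparse_code(x, D, groups)
scores = label_scores(a, groups)
print(scores)
```

In this toy case the test feature is built only from the atoms of label 1, so the group penalty should concentrate the coefficients in that label's block and give it the highest score.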
Pages: 2746-2759
Page count: 14