TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation

Cited by: 404
Authors
Guillaumin, Matthieu [1, 2]
Mensink, Thomas [1, 2]
Verbeek, Jakob [1, 2]
Schmid, Cordelia [1, 2]
Affiliations
[1] INRIA Grenoble, LEAR, Grenoble, France
[2] Lab Jean Kuntzmann, Grenoble, France
Source
2009 IEEE 12th International Conference on Computer Vision (ICCV) | 2009
DOI
10.1109/ICCV.2009.5459266
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Image auto-annotation is an important open problem in computer vision. For this task we propose TagProp, a discriminatively trained nearest neighbor model. Tags of test images are predicted using a weighted nearest-neighbor model to exploit labeled training images. Neighbor weights are based on neighbor rank or distance. TagProp allows the integration of metric learning by directly maximizing the log-likelihood of the tag predictions in the training set. In this manner, we can optimally combine a collection of image similarity metrics that cover different aspects of image content, such as local shape descriptors, or global color histograms. We also introduce a word specific sigmoidal modulation of the weighted neighbor tag predictions to boost the recall of rare words. We investigate the performance of different variants of our model and compare to existing work. We present experimental results for three challenging data sets. On all three, TagProp makes a marked improvement as compared to the current state-of-the-art.
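The weighted nearest-neighbor prediction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the formulas, so the following are assumptions made for illustration: distance-based weights are taken as a softmax over the negative linear combination of base distances, each neighbor votes with probability `1 - eps` or `eps`, and the word-specific modulation is a logistic function with hypothetical parameters `alpha` and `beta`. All function and parameter names are hypothetical.

```python
import numpy as np

def neighbor_weights(dists, metric_w):
    """Distance-based neighbor weights (assumed form: softmax of negative distance).

    dists:    (K, M) array, M base distances from the query image to K neighbors.
    metric_w: (M,) non-negative combination weights for the base distances.
    """
    combined = dists @ metric_w            # (K,) combined distance per neighbor
    logits = -combined
    logits = logits - logits.max()         # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()                     # weights sum to 1

def predict_tags(neighbor_tags, weights, eps=1e-5):
    """Weighted vote over neighbor annotations.

    neighbor_tags: (K, T) binary array, 1 if the neighbor carries the tag.
    Returns a (T,) array of tag-presence probabilities.
    """
    votes = np.where(neighbor_tags > 0, 1.0 - eps, eps)
    return weights @ votes

def sigmoid_modulation(p, alpha, beta):
    """Word-specific logistic modulation (assumed form) of the weighted votes."""
    return 1.0 / (1.0 + np.exp(-(alpha * p + beta)))

def log_likelihood(p, y):
    """Bernoulli log-likelihood of binary tag labels y under predictions p."""
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Toy example: 3 neighbors, 2 base distances, 2 tags.
dists = np.array([[0.1, 0.2], [0.5, 0.4], [1.0, 0.9]])
metric_w = np.array([1.0, 1.0])
tags = np.array([[1, 0], [1, 1], [0, 1]])

weights = neighbor_weights(dists, metric_w)
probs = predict_tags(tags, weights)
print(weights, probs, log_likelihood(probs, np.array([1, 0])))
```

In a training setting, `metric_w` (and `alpha`, `beta` per word) would be adjusted to increase `log_likelihood` over the training images, which is the sense in which the abstract speaks of integrating metric learning; the optimizer itself is omitted here.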
Pages: 309-316
Number of pages: 8
Related papers
27 in total
  • [1] [Anonymous], ECCV
  • [2] [Anonymous], 2006, ECCV
  • [3] [Anonymous], 2003, **NON-TRADITIONAL**
  • [4] Matching words and pictures
    Barnard, K
    Duygulu, P
    Forsyth, D
    de Freitas, N
    Blei, DM
    Jordan, MI
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (06) : 1107 - 1135
  • [5] Supervised learning of semantic classes for image annotation and retrieval
    Carneiro, Gustavo
    Chan, Antoni B.
    Moreno, Pedro J.
    Vasconcelos, Nuno
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2007, 29 (03) : 394 - 410
  • [6] Cusano C., 2004, P INTERNET IMAGING S, V5304
  • [7] Duygulu Pinar, 2002, ECCV
  • [8] Feng S.L., 2004, CVPR
  • [9] Globerson A., 2006, NIPS
  • [10] A discriminative kernel-based model to rank images from text queries
    Grangier, David
    Bengio, Samy
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2008, 30 (08) : 1371 - 1384