Universal and Adapted Vocabularies for Generic Visual Categorization

Cited by: 131
Authors
Perronnin, Florent [1]
Affiliations
[1] Xerox Res Ctr Europe, F-38240 Meylan, France
Keywords
image categorization; bag-of-words; Gaussian mixture model; expectation-maximization; Bayesian adaptation
DOI
10.1109/TPAMI.2007.70755
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generic Visual Categorization (GVC) is the pattern classification problem that consists of assigning labels to an image based on its semantic content. This is a challenging task as one has to deal with inherent object/scene variations, as well as changes in viewpoint, lighting, and occlusion. Several state-of-the-art GVC systems use a vocabulary of visual terms to characterize images with a histogram of visual word counts. We propose a novel practical approach to GVC based on a universal vocabulary, which describes the content of all the considered classes of images, and class vocabularies obtained through the adaptation of the universal vocabulary using class-specific data. The main novelty is that an image is characterized by a set of histograms (one per class), where each histogram describes whether the image content is best modeled by the universal vocabulary or the corresponding class vocabulary. This framework is applied to two types of local image features: low-level descriptors such as the popular SIFT and high-level histograms of word co-occurrences in a spatial neighborhood. It is shown experimentally on two challenging data sets (an in-house database of 19 categories and the PASCAL VOC 2006 data set) that the proposed approach exhibits state-of-the-art performance at a modest computational cost.
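The per-class histogram idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the vocabularies are diagonal-covariance Gaussian mixtures, the class vocabulary is obtained by adapting only the means with a relevance-factor MAP update, and the histogram is the average soft assignment over the merged universal and class vocabularies. The function names and the `relevance` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Soft assignment of each descriptor in X (T, D) to each of K Gaussians."""
    diff = X[:, None, :] - means[None, :, :]                  # (T, K, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_p = np.log(weights)[None, :] + log_gauss
    log_p -= log_p.max(axis=1, keepdims=True)                 # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)                   # rows sum to 1

def adapt_means(X_class, weights, means, variances, relevance=16.0):
    """Assumed mean-only MAP adaptation of the universal vocabulary."""
    gamma = posteriors(X_class, weights, means, variances)    # (T, K)
    n = gamma.sum(axis=0)                                     # soft counts per word
    ml_means = (gamma.T @ X_class) / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                    # data vs. prior balance
    return alpha * ml_means + (1.0 - alpha) * means

def bipartite_histogram(X_image, weights, means, variances, class_means):
    """Histogram over the merged vocabulary: first K bins universal, last K class."""
    merged_weights = np.concatenate([weights, weights]) / 2.0
    merged_means = np.vstack([means, class_means])
    merged_vars = np.vstack([variances, variances])
    gamma = posteriors(X_image, merged_weights, merged_means, merged_vars)
    return gamma.mean(axis=0)                                 # length 2K, sums to 1

# Toy usage with synthetic 2-D "descriptors" (not real image features).
rng = np.random.default_rng(0)
uni_w = np.array([0.5, 0.5])
uni_mu = np.array([[0.0, 0.0], [5.0, 5.0]])
uni_var = np.ones((2, 2))
class_data = rng.normal(loc=[1.0, 1.0], scale=1.0, size=(200, 2))
class_mu = adapt_means(class_data, uni_w, uni_mu, uni_var)
hist = bipartite_histogram(class_data, uni_w, uni_mu, uni_var, class_mu)
```

Under these assumptions, an image would be described by one such histogram per class, and the mass in the last K bins indicates how much of the image is better explained by the adapted class vocabulary than by the universal one.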
Pages: 1243-1256
Number of pages: 14