Image Classification with the Fisher Vector: Theory and Practice

Cited by: 1055
Authors
Sanchez, Jorge [1]
Perronnin, Florent [2]
Mensink, Thomas [3]
Verbeek, Jakob [4]
Affiliations
[1] Univ Nacl Cordoba, FAMAF, CONICET, CIEM, RA-5000 Cordoba, Argentina
[2] Xerox Res Ctr Europe, F-38240 Meylan, France
[3] Univ Amsterdam, Intelligent Syst Lab Amsterdam, Amsterdam, Netherlands
[4] INRIA Grenoble, LEAR Team, F-38330 Montbonnot St Martin, France
Keywords
Image classification; Large-scale classification; Bag-of-Visual words; Fisher vector; Fisher kernel; Product quantization;
DOI
10.1007/s11263-013-0636-x
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
A standard approach to describing an image for classification and retrieval purposes is to extract a set of local patch descriptors, encode them into a high-dimensional vector and pool them into an image-level signature. The most common patch encoding strategy consists of quantizing the local descriptors into a finite set of prototypical elements. This leads to the popular Bag-of-Visual words representation. In this work, we propose to use the Fisher Kernel framework as an alternative patch encoding strategy: we describe patches by their deviation from a "universal" generative Gaussian mixture model. This representation, which we call the Fisher vector (FV), has many advantages: it is efficient to compute, it leads to excellent results even with efficient linear classifiers, and it can be compressed with a minimal loss of accuracy using product quantization. We report experimental results on five standard datasets (PASCAL VOC 2007, Caltech 256, SUN 397, ILSVRC 2010 and ImageNet10K), with up to 9M images and 10K classes, showing that the FV framework is a state-of-the-art patch encoding technique.
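The encoding idea in the abstract (describing patches by their deviation from a Gaussian mixture model, then pooling into one image-level vector) can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `fisher_vector` is hypothetical, the GMM is assumed to have diagonal covariances and to be already fitted, only gradients with respect to means and standard deviations are used, and the final power and L2 normalization steps are assumptions that go beyond what the abstract states.

```python
import numpy as np

def fisher_vector(X, weights, means, variances):
    """Hypothetical helper: encode local descriptors X (T x D) against a
    diagonal-covariance GMM with K components.
    weights: (K,), means: (K, D), variances: (K, D). Returns a 2*K*D vector."""
    T, D = X.shape
    sigma = np.sqrt(variances)                                   # (K, D)
    # Normalized deviation of every descriptor from every component: (T, K, D)
    diff = (X[:, None, :] - means[None, :, :]) / sigma[None, :, :]
    # log( w_k * N(x_t | mu_k, diag(sigma_k^2)) ), shape (T, K)
    log_prob = (np.log(weights)[None, :]
                - 0.5 * np.sum(diff ** 2, axis=2)
                - np.sum(np.log(sigma), axis=1)[None, :]
                - 0.5 * D * np.log(2.0 * np.pi))
    # Soft assignments (posteriors), computed in a numerically stable way
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)                    # (T, K)
    # Average-pooled gradient statistics for means and standard deviations
    g_mu = np.einsum('tk,tkd->kd', gamma, diff) / (T * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum('tk,tkd->kd', gamma, diff ** 2 - 1.0)
               / (T * np.sqrt(2.0 * weights)[:, None]))
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    # Assumed post-processing: signed square root, then L2 normalization
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv

# Toy usage: 50 random 4-dimensional "descriptors" and a 2-component GMM
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
weights = np.array([0.5, 0.5])
means = np.array([[-1.0] * 4, [1.0] * 4])
variances = np.ones((2, 4))
fv = fisher_vector(X, weights, means, variances)
print(fv.shape)
```

With K components and D-dimensional descriptors the resulting signature has 2KD entries regardless of how many patches the image contains, which is what allows it to be fed directly to a linear classifier.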
Pages: 222-245
Page count: 24