A Bayesian approach to unsupervised one-shot learning of object categories

Cited by: 304
Authors
Fei-Fei, L [1]
Fergus, R [1]
Perona, P [1]
Affiliations
[1] CALTECH, Dept Elect Engn, Pasadena, CA 91125 USA
Source
NINTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS I AND II, PROCEEDINGS | 2003
Keywords
DOI
10.1109/ICCV.2003.1238476
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning visual models of object categories notoriously requires thousands of training examples; this is due to the diversity and richness of object appearance, which requires models containing hundreds of parameters. We present a method for learning object categories from just a few images (1 to 5). It is based on incorporating "generic" knowledge which may be obtained from previously learnt models of unrelated categories. We operate in a variational Bayesian framework: object categories are represented by probabilistic models, and "prior" knowledge is represented as a probability density function on the parameters of these models. The "posterior" model for an object category is obtained by updating the prior in the light of one or more observations. Our ideas are demonstrated on four diverse categories (human faces, airplanes, motorcycles, spotted cats). Initially three categories are learnt from hundreds of training examples, and a "prior" is estimated from these. Then the model of the fourth category is learnt from 1 to 5 training examples, and is used for detecting new exemplars in a set of test images.
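The prior-to-posterior update described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's variational Bayesian object model; it assumes a single scalar parameter, a Gaussian prior fitted to the values learnt for three earlier categories, and Gaussian observation noise with known variance. The function names and all numbers are hypothetical.

```python
import statistics


def estimate_prior(category_values):
    """Fit a Gaussian prior (mean, variance) to a scalar parameter
    taken from previously learnt categories (hypothetical stand-in)."""
    mu0 = statistics.mean(category_values)
    var0 = statistics.variance(category_values)  # sample variance
    return mu0, var0


def posterior(mu0, var0, observations, noise_var):
    """Conjugate Normal-Normal update for an unknown mean,
    assuming the observation noise variance is known."""
    n = len(observations)
    precision = 1.0 / var0 + n / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (mu0 / var0 + sum(observations) / noise_var)
    return post_mean, post_var


# Hypothetical parameter values from three well-trained categories.
mu0, var0 = estimate_prior([1.0, 2.0, 3.0])

# New category: one training example versus five (illustrative values).
one_shot = posterior(mu0, var0, [4.0], noise_var=1.0)
five_shot = posterior(mu0, var0, [4.0] * 5, noise_var=1.0)

print("prior:    ", (mu0, var0))
print("one-shot: ", one_shot)
print("five-shot:", five_shot)
```

With a single example the estimate sits between the prior mean and the observation; as more examples arrive, the posterior moves toward the data and its variance shrinks.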
Pages: 1134-1141
Number of pages: 8