Learning Object Categories From Internet Image Searches

Cited by: 50

Authors:

Fergus, Rob [1]

Fei-Fei, Li [2]

Perona, Pietro [3]

Zisserman, Andrew [4]

Affiliations:

[1] Courant Inst, Dept Comp Sci, New York, NY 10003 USA

[2] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA

[3] CALTECH, Dept Elect Engn, Pasadena, CA 91125 USA

[4] Univ Oxford, Dept Engn Sci, Oxford OX1 3PJ, England

Source:

Proceedings of the IEEE | 2010 / Vol. 98 / No. 8

Funding:

European Research Council; UK Engineering and Physical Sciences Research Council

Keywords:

Internet image search engines; learning; object categories; recognition; unsupervised; SCALE;

DOI:

10.1109/JPROC.2010.2048990

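Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract: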

In this paper, we describe a simple approach to learning models of visual object categories from images gathered from Internet image search engines. The images for a given keyword are typically highly variable, with a large fraction being unrelated to the query term, and thus pose a challenging environment from which to learn. By training our models directly from Internet images, we remove the need to laboriously compile training data sets, as most other recognition approaches require. This opens up the possibility of learning object category models "on-the-fly." We describe two simple approaches, derived from the probabilistic latent semantic analysis (pLSA) technique for text document analysis, that can be used to automatically learn object models from these data. We show two applications of the learned model: first, to rerank the images returned by the search engine, thus improving the quality of the search engine; and second, to recognize objects in other image data sets.

Pages: 1453-1466

Page count: 14
