Exploring the Limits of Weakly Supervised Pretraining

Cited by: 766
Authors
Mahajan, Dhruv [1 ]
Girshick, Ross [1 ]
Ramanathan, Vignesh [1 ]
He, Kaiming [1 ]
Paluri, Manohar [1 ]
Li, Yixuan [1 ]
Bharambe, Ashwin [1 ]
van der Maaten, Laurens [1 ]
Affiliations
[1] Facebook, Menlo Park, CA 94025, USA
Source
COMPUTER VISION - ECCV 2018, PT II | 2018 / Vol. 11206
Keywords
DOI
10.1007/978-3-030-01216-8_12
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
State-of-the-art visual perception models for a wide range of tasks rely on supervised pretraining. ImageNet classification is the de facto pretraining task for these models. Yet, ImageNet is now nearly ten years old and is by modern standards "small". Even so, relatively little is known about the behavior of pretraining with datasets that are multiple orders of magnitude larger. The reasons are obvious: such datasets are difficult to collect and annotate. In this paper, we present a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images. Our experiments demonstrate that training for large-scale hashtag prediction leads to excellent results. We show improvements on several image classification and object detection tasks, and report the highest ImageNet-1k single-crop, top-1 accuracy to date: 85.4% (97.6% top-5). We also perform extensive experiments that provide novel empirical data on the relationship between large-scale pretraining and transfer learning performance.
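The abstract describes a two-stage recipe: pretrain a large convolutional network to predict hashtags on social media images, then transfer it to target recognition tasks. The sketch below is a minimal PyTorch illustration of that recipe, not the authors' released code or exact setup; the ResNeXt-50 backbone (standing in for the paper's ResNeXt-101 variants), the 1,500-hashtag vocabulary, the synthetic mini-batch, and all hyperparameters are illustrative assumptions rather than values from the paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnext50_32x4d

NUM_HASHTAGS = 1500        # assumed hashtag vocabulary size (illustrative only)
NUM_TARGET_CLASSES = 1000  # e.g. ImageNet-1k for the transfer stage

# Stage 1: hashtag-prediction pretraining. Each image may carry several
# hashtags; the target is its multi-hot hashtag vector normalized to sum to
# one, trained with cross-entropy against a softmax over the vocabulary.
model = resnext50_32x4d()  # smaller stand-in for the paper's ResNeXt-101 models
model.fc = nn.Linear(model.fc.in_features, NUM_HASHTAGS)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Tiny synthetic batch standing in for the billions of social-media images.
images = torch.randn(8, 3, 224, 224)
multi_hot = (torch.rand(8, NUM_HASHTAGS) < 0.01).float()
targets = multi_hot / multi_hot.sum(dim=1, keepdim=True).clamp(min=1.0)
loader = DataLoader(TensorDataset(images, targets), batch_size=4)

model.train()
for x, y in loader:
    optimizer.zero_grad()
    log_probs = torch.log_softmax(model(x), dim=1)
    loss = -(y * log_probs).sum(dim=1).mean()  # multi-label cross-entropy
    loss.backward()
    optimizer.step()

# Stage 2: transfer learning. Keep the pretrained trunk, replace the
# classifier head, and fine-tune (or train only the new head) on the target task.
model.fc = nn.Linear(model.fc.in_features, NUM_TARGET_CLASSES)
finetune_optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
```

In the paper, the transfer stage is evaluated both by fine-tuning the full network and by training only a classifier on frozen pretrained features.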
Pages: 185-201
Page count: 17