With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations

Cited by: 203
Authors
Dwibedi, Debidatta [1]
Aytar, Yusuf [2]
Tompson, Jonathan [1]
Sermanet, Pierre [1]
Zisserman, Andrew [2]
Affiliations
[1] Google Research, Mountain View, CA 94043, USA
[2] DeepMind, London, England
Source
2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021) | 2021
Keywords
DOI
10.1109/ICCV48922.2021.00945
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Self-supervised learning algorithms based on instance discrimination train encoders to be invariant to pre-defined transformations of the same instance. While most methods treat different views of the same image as positives for a contrastive loss, we are interested in using positives from other instances in the dataset. Our method, Nearest-Neighbor Contrastive Learning of visual Representations (NNCLR), samples the nearest neighbors from the dataset in the latent space and treats them as positives. This provides more semantic variation than pre-defined transformations. We find that using the nearest neighbor as a positive in contrastive losses improves performance significantly on ImageNet classification with a ResNet-50 under the linear evaluation protocol, from 71.7% to 75.6%, outperforming previous state-of-the-art methods. On semi-supervised learning benchmarks we improve performance significantly when only 1% of ImageNet labels are available, from 53.8% to 56.5%. On transfer learning benchmarks our method outperforms state-of-the-art methods (including supervised learning with ImageNet) on 8 out of 12 downstream datasets. Furthermore, we demonstrate empirically that our method is less reliant on complex data augmentations: we see a relative reduction of only 2.1% in ImageNet Top-1 accuracy when training with random crops alone.
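As a rough illustration of the mechanism the abstract describes (replacing the usual augmented-view positive with a nearest neighbor drawn from a support set of previously computed embeddings), here is a minimal PyTorch-style sketch. It is not the authors' released implementation; the helper names (`nearest_neighbor`, `nnclr_loss`), the temperature value, and the tensor shape conventions are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F

def nearest_neighbor(z, support_set):
    # For each embedding in z, return its nearest neighbor from the support set,
    # measured by cosine similarity over L2-normalized embeddings.
    z = F.normalize(z, dim=1)
    support = F.normalize(support_set, dim=1)
    sim = z @ support.t()                      # (batch, support_size)
    return support[sim.argmax(dim=1)]          # (batch, dim)

def nnclr_loss(z1, z2, support_set, temperature=0.1):
    # InfoNCE-style loss in which the positive for each sample is the nearest
    # neighbor of its first-view embedding, contrasted against the second-view
    # embeddings of every sample in the batch.
    nn1 = nearest_neighbor(z1, support_set)    # positives come from other instances
    z2 = F.normalize(z2, dim=1)
    logits = nn1 @ z2.t() / temperature        # (batch, batch); diagonal entries are the positive pairs
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```

In the paper, the support set is maintained as a first-in-first-out queue of embeddings from earlier batches, and the loss is symmetrized over the two augmented views; those details are omitted from this sketch.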
Pages: 9568-9577
Page count: 10