Momentum Contrast for Unsupervised Visual Representation Learning

Cited by: 8849
Authors
He, Kaiming [1]
Fan, Haoqi [1]
Wu, Yuxin [1]
Xie, Saining [1]
Girshick, Ross [1]
Affiliations
[1] Facebook AI Research (FAIR), Menlo Park, CA 94025, USA
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
DOI
10.1109/CVPR42600.2020.00975
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning [29] as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. This enables building a large and consistent dictionary on-the-fly that facilitates contrastive unsupervised learning. MoCo provides competitive results under the common linear protocol on ImageNet classification. More importantly, the representations learned by MoCo transfer well to downstream tasks. MoCo can outperform its supervised pre-training counterpart in 7 detection/segmentation tasks on PASCAL VOC, COCO, and other datasets, sometimes surpassing it by large margins. This suggests that the gap between unsupervised and supervised representation learning has been largely closed in many vision tasks.
Pages: 9726-9735
Number of pages: 10
Related papers
66 entries in total
[1]   [Anonymous], 2011, IEEE I CONF COMP VIS
[2]   [Anonymous], IEEE I CONF COMP VIS
[3]   [Anonymous], 2021, AUTOPHAGY, DOI 10.1080/15548627.2020.1810918
[4]   [Anonymous], 2011, P 28 INT C INT C MAC
[5]   Bachman P, 2019, ADV NEUR IN, V32
[6]   Unsupervised Pre-Training of Image Features on Non-Curated Data [J].
Caron, Mathilde ;
Bojanowski, Piotr ;
Mairal, Julien ;
Joulin, Armand .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :2959-2968
[7]   Deep Clustering for Unsupervised Learning of Visual Features [J].
Caron, Mathilde ;
Bojanowski, Piotr ;
Joulin, Armand ;
Douze, Matthijs .
COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 :139-156
[8]   MegDet: A Large Mini-Batch Object Detector [J].
Peng, Chao ;
Xiao, Tete ;
Li, Zeming ;
Jiang, Yuning ;
Zhang, Xiangyu ;
Jia, Kai ;
Yu, Gang ;
Sun, Jian .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :6181-6189
[9]   The devil is in the details: an evaluation of recent feature encoding methods [J].
Chatfield, Ken ;
Lempitsky, Victor ;
Vedaldi, Andrea ;
Zisserman, Andrew .
PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2011, 2011
[10]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848