DETReg: Unsupervised Pretraining with Region Priors for Object Detection

Cited by: 41
Authors
Bar, Amir [1]
Wang, Xin [5]
Kantorov, Vadim [1]
Reed, Colorado J. [2]
Herzig, Roei [1]
Chechik, Gal [3, 4]
Rohrbach, Anna [2]
Darrell, Trevor [2]
Globerson, Amir [1]
Affiliations
[1] Tel Aviv Univ, Tel Aviv, Israel
[2] Berkeley AI Res, Berkeley, CA USA
[3] NVIDIA, Santa Clara, CA USA
[4] Bar Ilan Univ, Ramat Gan, Israel
[5] Microsoft Res, Redmond, WA USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
European Research Council
DOI
10.1109/CVPR52688.2022.01420
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent self-supervised pretraining methods for object detection largely focus on pretraining the backbone of the object detector, neglecting key parts of detection architecture. Instead, we introduce DETReg, a new self-supervised method that pretrains the entire object detection network, including the object localization and embedding components. During pretraining, DETReg predicts object localizations to match the localizations from an unsupervised region proposal generator and simultaneously aligns the corresponding feature embeddings with embeddings from a self-supervised image encoder. We implement DETReg using the DETR family of detectors and show that it improves over competitive baselines when finetuned on COCO, PASCAL VOC, and Airbus Ship benchmarks. In low-data regimes, including semi-supervised and few-shot learning settings, DETReg establishes many state-of-the-art results, e.g., on COCO we see a +6.0 AP improvement for 10-shot detection and over 2 AP improvements when training with only 1% of the labels.
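The two pretraining signals described in the abstract (matching predicted localizations to region proposals, and aligning the corresponding embeddings with those of a self-supervised encoder) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of L1 distance for both terms, the brute-force one-to-one matching, and the weights `w_box` and `w_emb` are all assumptions made for illustration.

```python
# Illustrative sketch only; names, distances, and weights are assumptions,
# not taken from the DETReg paper or code.
from itertools import permutations


def l1(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match(pred_boxes, target_boxes):
    """Brute-force one-to-one assignment minimizing total L1 box distance.

    Returns a list where entry j is the index of the prediction assigned
    to target j. Only practical for a handful of boxes.
    """
    n = len(target_boxes)
    best = min(
        permutations(range(len(pred_boxes)), n),
        key=lambda p: sum(l1(pred_boxes[p[j]], target_boxes[j]) for j in range(n)),
    )
    return list(best)


def pretraining_loss(pred_boxes, pred_embs, target_boxes, target_embs,
                     w_box=1.0, w_emb=1.0):
    """Weighted sum of a box term and an embedding term, averaged over targets.

    target_boxes stand in for unsupervised region proposals; target_embs stand
    in for embeddings of those regions from a self-supervised encoder.
    """
    assign = match(pred_boxes, target_boxes)
    box_loss = sum(l1(pred_boxes[i], target_boxes[j]) for j, i in enumerate(assign))
    emb_loss = sum(l1(pred_embs[i], target_embs[j]) for j, i in enumerate(assign))
    return (w_box * box_loss + w_emb * emb_loss) / len(target_boxes)


# Boxes are (cx, cy, w, h) in normalized coordinates (an assumed convention).
pred_boxes = [[0.5, 0.5, 0.2, 0.2], [0.1, 0.1, 0.1, 0.1]]
pred_embs = [[1.0, 0.0], [0.0, 1.0]]
target_boxes = [[0.1, 0.1, 0.1, 0.1], [0.5, 0.5, 0.2, 0.2]]
target_embs = [[0.0, 1.0], [1.0, 0.0]]

assignment = match(pred_boxes, target_boxes)
loss = pretraining_loss(pred_boxes, pred_embs, target_boxes, target_embs)
```

In this toy case the predictions equal the targets up to ordering, so the matching swaps the two indices and both loss terms vanish. A practical version would replace the brute-force search with a polynomial-time assignment solver.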
Pages: 14585-14595
Page count: 11