DETReg: Unsupervised Pretraining with Region Priors for Object Detection

Cited by: 41

Authors:

Bar, Amir [1]

Wang, Xin [5]

Kantorov, Vadim [1]

Reed, Colorado J. [2]

Herzig, Roei [1]

Chechik, Gal [3,4]

Rohrbach, Anna [2]

Darrell, Trevor [2]

Globerson, Amir [1]

Affiliations:

[1] Tel Aviv Univ, Tel Aviv, Israel

[2] Berkeley AI Res, Berkeley, CA USA

[3] NVIDIA, Santa Clara, CA USA

[4] Bar Ilan Univ, Ramat Gan, Israel

[5] Microsoft Res, Redmond, WA USA

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

Funding:

European Research Council

DOI:

10.1109/CVPR52688.2022.01420

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Recent self-supervised pretraining methods for object detection largely focus on pretraining the backbone of the object detector, neglecting key parts of the detection architecture. Instead, we introduce DETReg, a new self-supervised method that pretrains the entire object detection network, including the object localization and embedding components. During pretraining, DETReg predicts object localizations to match the localizations from an unsupervised region proposal generator and simultaneously aligns the corresponding feature embeddings with embeddings from a self-supervised image encoder. We implement DETReg using the DETR family of detectors and show that it improves over competitive baselines when finetuned on COCO, PASCAL VOC, and Airbus Ship benchmarks. In low-data regimes, including semi-supervised and few-shot learning settings, DETReg establishes many state-of-the-art results, e.g., on COCO we see a +6.0 AP improvement for 10-shot detection and improvements of over 2 AP when training with only 1% of the labels.

Pages: 14585-14595

Page count: 11
