Towards Few-Annotation Learning for Object Detection: Are Transformer-based Models More Efficient?

Cited by: 2

Authors:

Bouniot, Quentin [1,2]

Loesch, Angelique [1]

Audigier, Romaric [1]

Habrard, Amaury [2,3]

Affiliations:

[1] Univ Paris Saclay, CEA, LIST, F-91120 Palaiseau, France

[2] Univ Lyon, UJM St Etienne, CNRS, IOGS, Lab Hubert Curien, UMR 5516, F-42023 St Etienne, France

[3] Inst Univ France IUF, Paris, France

Source:

2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2023

DOI:

10.1109/WACV56688.2023.00016

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

For specialized and dense downstream tasks such as object detection, labeling data requires expertise and can be very expensive, making few-shot and semi-supervised models much more attractive alternatives. While in the few-shot setup we observe that transformer-based object detectors perform better than convolution-based two-stage models with a similar number of parameters, they are not as effective when used with recent approaches in the semi-supervised setting. In this paper, we propose a semi-supervised method tailored for the current state-of-the-art object detector Deformable DETR in the few-annotation learning setup using a student-teacher architecture, which avoids relying on sensitive post-processing of the pseudo-labels generated by the teacher model. We evaluate our method on the semi-supervised object detection benchmarks COCO and Pascal VOC, and it outperforms previous methods, especially when annotations are scarce. We believe that our contributions open new possibilities to adapt similar object detection methods in this setup as well.

Pages: 75-84

Page count: 10
