Pretraining on a large dataset is the first stage of many computer vision tasks such as classification, detection, and segmentation. Conventional pretraining relies on large datasets with human annotations. In this context, self-supervised learning, which pretrains models on unlabeled data, has become increasingly attractive. Throughout the development of self-supervised learning, image-level contrastive representation learning has emerged as a highly effective approach for general transfer learning. However, its image-level representations may lack the specificity required by a particular downstream task, compromising performance on that task. Recently, SoCo, an object-level self-supervised pretraining framework, was proposed for object detection. To achieve object-level pretraining, SoCo adopts the traditional selective search algorithm to generate object proposals, which incurs high space and time costs and also prevents end-to-end training from reaching a global optimum. In this work, we propose an end-to-end object-level contrastive pretraining framework for detection that obtains object proposals from the pretraining network itself. Specifically, we use the heat map computed from the features of the last backbone convolutional layer as semantic cues to roughly localize objects, and then generate promising proposals with center-suppressed sampling and multiple cropping strategies. Experimental results show that our method achieves better performance with significantly lower training space and time costs.
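To make the proposal-generation step concrete, the following is a minimal sketch, not the authors' released code, of how heat-map-guided localization with center-suppressed sampling and multiple cropping could look in PyTorch. The function name `heatmap_proposals` and all parameter values (`num_centers`, `scales`, `suppress_radius`) are illustrative assumptions.

```python
import torch

def heatmap_proposals(feat, img_size, num_centers=4,
                      scales=(0.2, 0.4), suppress_radius=0.15):
    """Sketch: feat is a (C, H, W) tensor from the last backbone conv layer.
    Returns proposal boxes as (x1, y1, x2, y2) in image-pixel coordinates."""
    C, H, W = feat.shape
    # Channel-averaged activations serve as a coarse object-ness heat map.
    heat = feat.mean(dim=0)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-6)

    probs = heat.flatten().clone()
    boxes = []
    ys = torch.arange(H, dtype=torch.float32).view(-1, 1)
    xs = torch.arange(W, dtype=torch.float32).view(1, -1)
    for _ in range(num_centers):
        if probs.sum() <= 0:
            break
        # Sample a proposal center, favoring high-activation locations.
        idx = torch.multinomial(probs, 1).item()
        cy, cx = divmod(idx, W)
        # Multiple cropping: boxes of several scales around each center.
        for s in scales:
            bw = bh = s * img_size
            x_c = (cx + 0.5) / W * img_size
            y_c = (cy + 0.5) / H * img_size
            boxes.append((max(0.0, x_c - bw / 2), max(0.0, y_c - bh / 2),
                          min(float(img_size), x_c + bw / 2),
                          min(float(img_size), y_c + bh / 2)))
        # Center suppression: zero out a disc around the sampled center
        # so subsequent samples spread toward other objects.
        dist = ((ys - cy) ** 2 + (xs - cx) ** 2).sqrt()
        probs = probs * (dist > suppress_radius * max(H, W)).flatten().float()
    return boxes

# Example usage with a dummy feature map from a 224x224 input image.
feat = torch.rand(256, 7, 7)
boxes = heatmap_proposals(feat, img_size=224)
```

Because the proposals come from the pretraining network's own features rather than an external selective search pass, no proposal cache needs to be stored and the whole pipeline can be optimized end to end.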