Plain-Det: A Plain Multi-dataset Object Detector

被引:0
作者
Shi, Cheng [1 ]
Zhu, Yuchen [1 ]
Yang, Sibei [1 ]
机构
[1] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
来源
COMPUTER VISION - ECCV 2024, PT V | 2025年 / 15063卷
基金
中国国家自然科学基金;
关键词
Object detection; Multiple datasets; Proposal generation;
D O I
10.1007/978-3-031-72652-1_13
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Recent advancements in large-scale foundational models have sparked widespread interest in training highly proficient large vision models. A common consensus revolves around the necessity of aggregating extensive, high-quality annotated data. However, given the inherent challenges in annotating dense tasks in computer vision, such as object detection and segmentation, a practical strategy is to combine and leverage all available data for training purposes. In this work, we propose Plain-Det, which offers flexibility to accommodate new datasets, robustness in performance across diverse datasets, training efficiency, and compatibility with various detection architectures. We utilize Def-DETR, with the assistance of Plain-Det, to achieve a mAP of 51.9 on COCO, matching the current state-of-the-art detectors. We conduct extensive experiments on 13 downstream datasets and Plain-Det demonstrates strong generalization capability. Code is release at https://github.com/ChengShiest/Plain-Det.
引用
收藏
页码:210 / 226
页数:17
相关论文
共 45 条
[1]   End-to-End Object Detection with Transformers [J].
Carion, Nicolas ;
Massa, Francisco ;
Synnaeve, Gabriel ;
Usunier, Nicolas ;
Kirillov, Alexander ;
Zagoruyko, Sergey .
COMPUTER VISION - ECCV 2020, PT I, 2020, 12346 :213-229
[2]   Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts [J].
Changpinyo, Soravit ;
Sharma, Piyush ;
Ding, Nan ;
Soricut, Radu .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :3557-3567
[3]  
Chen Q, 2023, Arxiv, DOI arXiv:2207.13085
[4]   ScaleDet: A Scalable Multi-Dataset Object Detector [J].
Chen, Yanbei ;
Wang, Manchen ;
Mittal, Abhay ;
Xu, Zhenlin ;
Favaro, Paolo ;
Tighe, Joseph ;
Modolo, Davide .
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, :7288-7297
[5]   Dynamic Convolution: Attention over Convolution Kernels [J].
Chen, Yinpeng ;
Dai, Xiyang ;
Liu, Mengchen ;
Chen, Dongdong ;
Yuan, Lu ;
Liu, Zicheng .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :11027-11036
[6]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[7]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448
[8]  
Gu X., 2021, arXiv
[9]   LVIS: A Dataset for Large Vocabulary Instance Segmentation [J].
Gupta, Agrim ;
Dollar, Piotr ;
Girshick, Ross .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :5351-5359
[10]  
He KM, 2018, Arxiv, DOI [arXiv:1703.06870, 10.48550/arXiv.1703.06870, DOI 10.48550/ARXIV.1703.06870]