DiffusionDet: Diffusion Model for Object Detection

Times cited: 240
Authors
Chen, Shoufa [1]
Sun, Peize [1]
Song, Yibing [2, 3]
Luo, Ping [1]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Tencent AI Lab, Shenzhen, Peoples R China
[3] Fudan Univ, AI3 Inst, Shanghai, Peoples R China
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023
Funding
National Key Research and Development Program of China;
DOI
10.1109/ICCV51070.2023.01816
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to a random distribution, and the model learns to reverse this noising process. At inference, the model progressively refines a set of randomly generated boxes into the output results. Our work has the appealing property of flexibility, which enables a dynamic number of boxes and iterative evaluation. Extensive experiments on standard benchmarks show that DiffusionDet achieves favorable performance compared to previous well-established detectors. For example, DiffusionDet achieves 5.3 AP and 4.8 AP gains when evaluated with more boxes and iteration steps, under a zero-shot transfer setting from COCO to CrowdHuman. Our code is available at https://github.com/ShoufaChen/DiffusionDet.
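The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard DDPM-style forward process with a cosine noise schedule and a deterministic DDIM-style update, and it treats each box as four normalized numbers (for instance center x, center y, width, height). The names `alpha_bar`, `noise_boxes`, `refine`, and the callable `predict_boxes` are hypothetical; `predict_boxes` stands in for a trained detector head that maps noisy boxes at step `t` to estimated object boxes.

```python
import math
import random


def alpha_bar(t, num_steps, s=0.008):
    """Fraction of signal kept at step t under a cosine schedule (an assumption)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def noise_boxes(boxes, t, num_steps, rng):
    """Training-side forward process: blend ground-truth boxes with Gaussian noise."""
    a = alpha_bar(t, num_steps)
    return [
        [math.sqrt(a) * c + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for c in box]
        for box in boxes
    ]


def refine(predict_boxes, num_boxes, num_steps, rng):
    """Inference: start from random boxes and refine them progressively.

    `predict_boxes(noisy_boxes, t)` is a hypothetical stand-in for the trained
    model; it must return one estimated clean box per noisy box.
    """
    boxes = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(num_boxes)]
    for t in range(num_steps, 0, -1):
        a_t = alpha_bar(t, num_steps)
        a_prev = alpha_bar(t - 1, num_steps)
        estimates = predict_boxes(boxes, t)
        next_boxes = []
        for box, est in zip(boxes, estimates):
            row = []
            for c, c0 in zip(box, est):
                # Noise implied by the current box and the model's estimate.
                eps = (c - math.sqrt(a_t) * c0) / math.sqrt(1.0 - a_t)
                # Deterministic DDIM-style move to the previous noise level.
                row.append(math.sqrt(a_prev) * c0 + math.sqrt(1.0 - a_prev) * eps)
            next_boxes.append(row)
        boxes = next_boxes
    return boxes
```

Because `num_boxes` and `num_steps` are ordinary arguments to `refine`, the number of boxes and the number of refinement steps can be chosen at evaluation time, which is the flexibility the abstract refers to.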
Pages: 19773-19786
Number of pages: 14