DiffusionDet: Diffusion Model for Object Detection

Cited by: 240

Authors:

Chen, Shoufa [1]

Sun, Peize [1]

Song, Yibing [2,3]

Luo, Ping [1]

Affiliations:

[1] Univ Hong Kong, Hong Kong, Peoples R China

[2] Tencent AI Lab, Shenzhen, Peoples R China

[3] Fudan Univ, AI3 Inst, Shanghai, Peoples R China

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023

Funding:

National Key Research and Development Program of China;

Keywords:

DOI:

10.1109/ICCV51070.2023.01816

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to random distribution, and the model learns to reverse this noising process. In inference, the model refines a set of randomly generated boxes to the output results in a progressive way. Our work possesses an appealing property of flexibility, which enables the dynamic number of boxes and iterative evaluation. The extensive experiments on the standard benchmarks show that DiffusionDet achieves favorable performance compared to previous well-established detectors. For example, DiffusionDet achieves 5.3 AP and 4.8 AP gains when evaluated with more boxes and iteration steps, under a zero-shot transfer setting from COCO to CrowdHuman. Our code is available at https://github.com/ShoufaChen/DiffusionDet.
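
The training-time noising described in the abstract can be illustrated with a minimal sketch. It assumes a standard DDPM-style Gaussian forward process and a cosine noise schedule; the abstract states neither, and the function names, the box encoding (four normalized coordinates per box), and the default step count are hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level alpha_bar_t.

    Assumption: a cosine schedule; the abstract does not name a schedule.
    """
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def noise_boxes(gt_boxes, t, num_steps=1000, rng=None):
    """Forward step q(x_t | x_0): blend ground-truth boxes with Gaussian noise.

    Assumption: each box is a list of four normalized coordinates.
    """
    rng = rng or random.Random()
    a_bar = cosine_alpha_bar(t, num_steps)
    signal = math.sqrt(a_bar)        # weight kept from the clean box
    sigma = math.sqrt(1.0 - a_bar)   # weight given to the noise
    return [
        [signal * c + sigma * rng.gauss(0.0, 1.0) for c in box]
        for box in gt_boxes
    ]


if __name__ == "__main__":
    boxes = [[0.5, 0.5, 0.2, 0.3]]
    for step in (0, 250, 500, 1000):
        print(step, noise_boxes(boxes, step, rng=random.Random(0)))
```

At step 0 the boxes are returned unchanged, and at the final step they are essentially pure noise. A detector trained in this setting would learn to map the noisy boxes back toward the ground truth, which is what allows inference to start from randomly generated boxes.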


Pages: 19773-19786

Page count: 14
