DDet3D: embracing 3D object detector with diffusion

Cited by: 0

Authors:

Gopi Krishna Erabati [1]

Helder Araujo [1]

Affiliations:

[1] University of Coimbra, Institute of Systems and Robotics

Source:

Applied Intelligence | 2025 / Vol. 55 / Issue 4

Keywords:

3D object detection; Diffusion; LiDAR; Autonomous driving; Computer vision

DOI:

10.1007/s10489-024-06045-1

Abstract:

Existing approaches to 3D object detection rely on heuristic or learnable object proposals, which must be optimised during training. In our approach, we replace the hand-crafted or learnable object proposals with randomly generated ones, formulating a new paradigm that employs a diffusion model to detect 3D objects from a set of randomly generated and supervised learning-based object proposals in an autonomous driving application. We propose DDet3D, a diffusion-based 3D object detection framework that formulates 3D object detection as a generative task over 3D bounding box coordinates. To our knowledge, this work is the first to formulate 3D object detection with a denoising diffusion model and to establish that randomly generated and supervised learning-based 3D proposals (as opposed to empirical anchors or learnt queries) are also viable object candidates for 3D object detection. During training, noisy 3D boxes are obtained from the 3D ground-truth boxes by progressively adding Gaussian noise, and the DDet3D network is trained to reverse this diffusion process. During inference, the DDet3D network iteratively refines the randomly generated and supervised learning-based noisy 3D boxes to predict 3D bounding boxes conditioned on the LiDAR Bird’s Eye View (BEV) features. An advantage of DDet3D is that it decouples the training and inference stages, enabling the use of a larger number of proposal boxes or sampling steps during inference to improve accuracy. We conduct extensive experiments and analysis on the nuScenes and KITTI datasets. DDet3D achieves competitive performance compared to well-designed 3D object detectors. Our work serves as a strong baseline to explore and employ more efficient diffusion models for 3D perception tasks.
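
The training-time step described in the abstract, corrupting ground-truth boxes with progressively more Gaussian noise, can be illustrated with a minimal sketch. The abstract does not state the noise schedule, the box parameterisation, or any scaling factor, so everything below is an assumption: a cosine schedule, a 7-value box `(x, y, z, w, l, h, yaw)` normalised to `[-1, 1]`, and the hypothetical helper names `cosine_alpha_bar` and `noise_boxes`. It is not the authors' implementation.

```python
import numpy as np


def cosine_alpha_bar(num_steps: int, s: float = 0.008) -> np.ndarray:
    """Cumulative signal rate alpha_bar_t for t = 0..num_steps.

    Assumption: a cosine schedule; the abstract does not name one.
    """
    t = np.linspace(0, num_steps, num_steps + 1) / num_steps
    f = np.cos((t + s) / (1 + s) * np.pi / 2) ** 2
    return f / f[0]  # alpha_bar[0] == 1 (no noise), alpha_bar[-1] ~ 0


def noise_boxes(gt_boxes, t, alpha_bar, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    a_t = alpha_bar[t]
    eps = rng.standard_normal(gt_boxes.shape)
    return np.sqrt(a_t) * gt_boxes + np.sqrt(1.0 - a_t) * eps


rng = np.random.default_rng(0)
alpha_bar = cosine_alpha_bar(1000)

# Hypothetical ground-truth boxes, already normalised to [-1, 1]:
# one row per box, columns (x, y, z, w, l, h, yaw).
gt = rng.uniform(-1.0, 1.0, size=(4, 7))

slightly_noisy = noise_boxes(gt, 10, alpha_bar, rng)   # close to the ground truth
mostly_noise = noise_boxes(gt, 990, alpha_bar, rng)    # close to pure Gaussian noise
```

At the largest time step the noised boxes are nearly indistinguishable from pure Gaussian samples, which is consistent with the abstract's point that inference can start from randomly generated boxes and that their number need not match the number used during training.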

