DDet3D: embracing 3D object detector with diffusion
G.K. Erabati and H. Araujo
Applied Intelligence, 2025, 55(4)

Cited by: 0
Authors
Gopi Krishna Erabati [1 ]
Helder Araujo [1 ]
Affiliations
[1] University of Coimbra, Institute of Systems and Robotics
Keywords
3D object detection; Diffusion; LiDAR; Autonomous driving; Computer vision
DOI
10.1007/s10489-024-06045-1
Abstract
Existing approaches rely on heuristic or learnable object proposals (which must be optimised during training) for 3D object detection. In our approach, we replace the hand-crafted or learnable object proposals with randomly generated ones, formulating a new paradigm that employs a diffusion model to detect 3D objects from a set of randomly generated and supervised-learning-based object proposals in an autonomous driving application. We propose DDet3D, a diffusion-based 3D object detection framework that formulates 3D object detection as a generative task over the 3D bounding box coordinates in 3D space. To our knowledge, this work is the first to formulate 3D object detection with a denoising diffusion model and to establish that randomly generated and supervised-learning-based 3D proposals (as opposed to empirical anchors or learnt queries) are also viable object candidates for 3D object detection. During training, 3D noisy boxes are obtained from the 3D ground-truth boxes by progressively adding Gaussian noise, and the DDet3D network is trained to reverse this diffusion process. During inference, the DDet3D network iteratively refines the randomly generated and supervised-learning-based noisy boxes into predicted 3D bounding boxes, conditioned on the LiDAR Bird's Eye View (BEV) features. An advantage of DDet3D is that it decouples the training and inference stages, enabling the use of a larger number of proposal boxes or sampling steps during inference to improve accuracy. We conduct extensive experiments and analysis on the nuScenes and KITTI datasets. DDet3D achieves competitive performance compared to well-designed 3D object detectors. Our work serves as a strong baseline to explore and employ more efficient diffusion models for 3D perception tasks.
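
The noising/denoising procedure summarised in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' implementation: the denoise_head callable, the cosine noise schedule, and the DDIM-style re-noising step are assumptions made for the sketch, and the paper's proposal padding, box normalisation, and supervised proposal generation are omitted.

import torch

def cosine_alpha_bar(t, T, s=0.008):
    # Cumulative signal rate (alpha-bar) of a standard cosine noise schedule.
    f = torch.cos((t / T + s) / (1 + s) * torch.pi / 2) ** 2
    f0 = torch.cos(torch.tensor(s / (1 + s) * torch.pi / 2)) ** 2
    return f / f0

def q_sample(boxes, t, T):
    # Forward diffusion: corrupt (ground-truth) boxes with Gaussian noise at step t.
    a_bar = cosine_alpha_bar(t.float(), T).view(-1, 1, 1)
    noise = torch.randn_like(boxes)
    return a_bar.sqrt() * boxes + (1.0 - a_bar).sqrt() * noise

@torch.no_grad()
def sample(denoise_head, bev_feats, num_proposals=300, steps=4, T=1000, box_dim=7):
    # Inference: start from pure Gaussian proposals and refine them iteratively,
    # conditioned on the LiDAR BEV features.
    batch = bev_feats.shape[0]
    boxes = torch.randn(batch, num_proposals, box_dim)
    ts = torch.linspace(T - 1, 0, steps).long()
    for i, t in enumerate(ts):
        # denoise_head is a hypothetical stand-in for the detection head that
        # predicts clean boxes and class logits from the current noisy proposals.
        pred_boxes, pred_logits = denoise_head(boxes, bev_feats, t)
        if i + 1 < len(ts):
            # DDIM-style step: re-noise the predicted boxes to the next timestep.
            boxes = q_sample(pred_boxes, ts[i + 1].expand(batch), T)
        else:
            boxes = pred_boxes
    return boxes, pred_logits

During training, q_sample would corrupt the ground-truth boxes at a random timestep and the head would be supervised to recover them. Because the two stages are decoupled, num_proposals and steps can be increased at inference time without retraining, which is the accuracy/latency trade-off the abstract describes.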