DDet3D: embracing 3D object detector with diffusion

Authors
Gopi Krishna Erabati [1]
Helder Araujo [1]
Affiliations
[1] University of Coimbra, Institute of Systems and Robotics
Keywords
3D object detection; Diffusion; LiDAR; Autonomous driving; Computer vision;
DOI
10.1007/s10489-024-06045-1
Abstract
Existing approaches to 3D object detection rely on heuristic or learnable object proposals, which must be optimised during training. We replace these hand-crafted or learnable proposals with randomly generated ones, formulating a new paradigm in which a diffusion model detects 3D objects from a set of randomly generated and supervised learning-based object proposals in an autonomous driving application. We propose DDet3D, a diffusion-based 3D object detection framework that formulates 3D object detection as a generative task over the 3D bounding box coordinates in 3D space. To our knowledge, this work is the first to formulate 3D object detection with a denoising diffusion model and to establish that 3D randomly generated and supervised learning-based proposals (as opposed to empirical anchors or learnt queries) are also potential object candidates for 3D object detection. During training, 3D random noisy boxes are generated from the 3D ground truth boxes by progressively adding Gaussian noise, and the DDet3D network is trained to reverse this diffusion process. During inference, the DDet3D network iteratively refines the 3D randomly generated and supervised learning-based noisy boxes to predict 3D bounding boxes conditioned on the LiDAR Bird's Eye View (BEV) features. An advantage of DDet3D is that it decouples the training and inference stages, so a larger number of proposal boxes or sampling steps can be used during inference to improve accuracy. We conduct extensive experiments and analysis on the nuScenes and KITTI datasets. DDet3D achieves competitive performance compared to well-designed 3D object detectors. Our work serves as a strong baseline to explore and employ more efficient diffusion models for 3D perception tasks.
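The training-time step described in the abstract (ground truth boxes progressively corrupted with Gaussian noise) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the cosine noise schedule, the 7-value box layout (x, y, z, l, w, h, yaw), the normalised box values, and the names `cosine_alpha_bar` and `add_noise_to_boxes` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def cosine_alpha_bar(num_steps: int, s: float = 0.008) -> np.ndarray:
    """Cumulative signal-retention schedule alpha_bar[t] for t = 0..num_steps.

    Assumption: a cosine schedule; the abstract does not name a schedule.
    """
    t = np.linspace(0.0, 1.0, num_steps + 1)
    f = np.cos((t + s) / (1.0 + s) * np.pi / 2.0) ** 2
    return f / f[0]  # normalise so that alpha_bar[0] == 1


def add_noise_to_boxes(gt_boxes, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar[t]) * x_0, (1 - alpha_bar[t]) * I)."""
    noise = rng.standard_normal(gt_boxes.shape)
    a_t = alpha_bar[t]
    return np.sqrt(a_t) * gt_boxes + np.sqrt(1.0 - a_t) * noise


rng = np.random.default_rng(0)
alpha_bar = cosine_alpha_bar(1000)

# Two hypothetical ground truth boxes (x, y, z, l, w, h, yaw), assumed to be
# already normalised to a range comparable with unit-variance noise.
gt_boxes = np.array([
    [0.2, -0.1, 0.00, 0.4, 0.2, 0.15, 0.3],
    [-0.5, 0.6, 0.05, 0.1, 0.1, 0.20, -1.0],
])

slightly_noisy = add_noise_to_boxes(gt_boxes, 10, alpha_bar, rng)   # early step
mostly_noise = add_noise_to_boxes(gt_boxes, 990, alpha_bar, rng)    # late step
```

At small `t` the sampled boxes stay close to the ground truth, while at large `t` they are close to pure Gaussian noise. That is why inference can start from randomly generated boxes, with the trained network applied repeatedly to refine them.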