MonoDFM: Density Field Modeling-Based End-to-End Monocular 3D Object Detection

Cited: 0
Authors
Liu, Gang [1]
Huang, Xinrui [1]
Xie, Xiaoxiao [1]
Affiliations
[1] Hubei Univ Technol, Sch Comp Sci, Wuhan 430068, Peoples R China
Keywords
Three-dimensional displays; Solid modeling; Object detection; Computational modeling; Neural radiance field; Accuracy; Feature extraction; Depth measurement; Point cloud compression; Real-time systems; 3D reconstruction; density field modeling; end-to-end detection; monocular 3D object detection; occlusion handling
DOI
10.1109/ACCESS.2025.3563248
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Monocular 3D object detection aims to infer the 3D properties of objects from a single RGB image. Existing methods primarily rely on planar features to estimate 3D information directly; however, the limited 3D cues available in 2D images often lead to suboptimal detection accuracy. To address this challenge, we propose MonoDFM, an end-to-end monocular 3D object detection method based on density field modeling. By modeling a density field from the features of a single image, MonoDFM enables a more accurate transition from 2D to 3D representations and improves 3D attribute prediction. Unlike traditional depth-map methods, which are limited to visible regions, MonoDFM infers geometric information in occluded regions by predicting the scene's density field. Moreover, compared with more complex approaches such as Neural Radiance Fields (NeRF), MonoDFM provides a streamlined and efficient prediction process. On the KITTI benchmark for the Car category, MonoDFM achieves AP3D of (25.13, 16.61, 13.82) and APBEV of (32.61, 22.14, 18.71) under the easy, moderate, and hard difficulty settings, which is competitive with existing methods. Ablation studies further validate the effectiveness of each component. Overall, MonoDFM offers an effective approach to monocular 3D object detection.
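The abstract's central idea is to lift 2D image features into a 3D representation by predicting a volumetric density along each camera ray, so that geometric mass can also be assigned to depths behind the first visible surface rather than only to a single depth value per pixel. The paper's actual architecture is not reproduced in this record, so the following is only a minimal NumPy sketch of that general lifting step under stated assumptions; the function name, depth bins, and the stand-in density head are hypothetical and not taken from MonoDFM.

```python
import numpy as np

def lift_features_with_density(feat_2d, depths):
    """Sketch: weight image features along each camera ray by a predicted density.

    feat_2d : (C, H, W) image feature map
    depths  : (D,) candidate depth bins along each ray
    Returns a (C, D, H, W) frustum of density-weighted features.
    """
    C, H, W = feat_2d.shape
    D = depths.shape[0]

    # Hypothetical density head: a softmax over depth bins derived from the
    # feature norm, standing in for a learned predictor in a real detector.
    logits = np.linalg.norm(feat_2d, axis=0, keepdims=True)   # (1, H, W)
    logits = np.repeat(logits[None], D, axis=1)                # (1, D, H, W)
    logits = logits / depths[None, :, None, None]              # broadcast over bins
    logits = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    sigma = np.exp(logits)
    sigma = sigma / sigma.sum(axis=1, keepdims=True)           # normalized density

    # Each depth bin receives the 2D feature scaled by its density, giving a
    # coarse 3D frustum that can place mass in occluded regions as well.
    frustum = sigma * feat_2d[:, None, :, :]                   # (C, D, H, W)
    return frustum

# Usage example with dummy inputs
feat = np.random.rand(8, 4, 6).astype(np.float32)
depth_bins = np.linspace(2.0, 40.0, 16)
vol = lift_features_with_density(feat, depth_bins)
print(vol.shape)  # (8, 16, 4, 6)
```

In a full detector, the normalized density would come from a learned head, and the resulting frustum would typically be resampled into a world-aligned voxel grid using the camera intrinsics before 3D box prediction.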
Pages: 74015-74031
Page count: 17