MonoDFM: Density Field Modeling-Based End-to-End Monocular 3D Object Detection

Authors:

Liu, Gang [1]; Huang, Xinrui [1]; Xie, Xiaoxiao [1]

Affiliation:

[1] Hubei University of Technology, School of Computer Science, Wuhan 430068, People's Republic of China

Source:

IEEE Access | 2025, Vol. 13

Keywords:

Three-dimensional displays; Solid modeling; Object detection; Computational modeling; Neural radiance field; Accuracy; Feature extraction; Depth measurement; Point cloud compression; Real-time systems; 3D reconstruction; density field modeling; end-to-end detection; monocular 3D object detection; occlusion handling

DOI:

10.1109/ACCESS.2025.3563248

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Monocular 3D object detection aims to infer the 3D properties of objects from a single RGB image. Existing methods primarily estimate 3D information directly from planar (2D) features, but the limited 3D information available in a 2D image often results in suboptimal detection accuracy. To address this challenge, we propose MonoDFM, an end-to-end monocular 3D object detection method based on density field modeling. By modeling the density field from the features of a single image, MonoDFM enables a more accurate transition from 2D to 3D representations and thereby improves the accuracy of 3D attribute prediction. Unlike traditional depth-map methods, which are limited to visible regions, MonoDFM predicts the scene's density field and can therefore infer geometric information about occluded regions. Moreover, compared with more complex approaches such as Neural Radiance Fields (NeRF), MonoDFM provides a streamlined and efficient prediction process. On the KITTI benchmark, MonoDFM achieves AP3D of (25.13, 16.61, 13.82) and APBEV of (32.61, 22.14, 18.71) for the Car category under the three difficulty settings (easy, moderate, and hard), which is competitive performance. Ablation studies further validate the effectiveness of each component. Overall, MonoDFM offers an effective approach to monocular 3D object detection.
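The contrast the abstract draws between a depth map and a density field can be illustrated with a small sketch. This is not the MonoDFM implementation; it only assumes the standard volume-rendering relation between per-sample density and ray weights, and every name in it (`ray_weights`, `expected_depth`, `occupied_samples`, the sample spacing `step`, the density `threshold`) is hypothetical. A ray passes through two objects, one hiding the other: the rendered depth collapses to the first surface, while the density samples still record both.

```python
import math


def ray_weights(densities, step):
    """Volume-rendering weight of each sample along one ray.

    weight_i = T_i * alpha_i, with alpha_i = 1 - exp(-sigma_i * step)
    and T_i the transmittance accumulated before sample i.
    """
    weights = []
    transmittance = 1.0
    for sigma in densities:
        alpha = 1.0 - math.exp(-sigma * step)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def expected_depth(densities, step):
    """Depth a depth map would report: the weight-averaged sample depth."""
    weights = ray_weights(densities, step)
    total = sum(weights)
    if total == 0.0:
        return None  # the ray hits nothing
    depths = [(i + 0.5) * step for i in range(len(densities))]
    return sum(w * d for w, d in zip(weights, depths)) / total


def occupied_samples(densities, threshold):
    """Indices of samples the density field marks as occupied (assumed threshold)."""
    return [i for i, sigma in enumerate(densities) if sigma > threshold]


# Hypothetical ray: 20 samples spaced 1 m apart, with an object at
# sample 10 and a second, fully occluded object at sample 18.
step = 1.0
densities = [0.0] * 20
densities[10] = 50.0
densities[18] = 50.0

depth = expected_depth(densities, step)
occupied = occupied_samples(densities, threshold=1.0)
```

In this toy ray, `depth` is determined almost entirely by the first object, because the transmittance behind it is close to zero, whereas `occupied` still lists both objects. That is the sense in which a density field can carry geometry that a single depth value per pixel cannot.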

Pages: 74015-74031

Page count: 17
