YOLO MDE: Object Detection with Monocular Depth Estimation

Cited by: 11
Authors
Yu, Jongsub [1]
Choi, Hyukdoo [1]
Affiliation
[1] Soonchunhyang Univ, Dept Elect Mat & Devices Engn, Asan 31538, South Korea
Keywords
object detection; depth estimation; deep learning
DOI
10.3390/electronics11010076
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper presents an object detector that also estimates depth from monocular camera images. Previous detection studies have typically focused on detecting objects with 2D or 3D bounding boxes. A 3D bounding box consists of a center point, size parameters, and heading information. However, predicting such a complex output composition generally lowers a model's performance, and the full 3D box is not necessary for risk assessment in autonomous driving. We instead focus on predicting a single depth per object, which is essential for that risk assessment. Our network architecture is based on YOLO v4, a fast and accurate one-stage object detector, to which we add one channel in the output layer for depth estimation. To train depth prediction, we extract the closest depth from the 3D bounding box coordinates of the ground-truth labels in the dataset. We compare our model with the latest studies on 3D object detection using the KITTI object detection benchmark. The results show that our model achieves higher detection performance and detection speed than existing models, with comparable depth accuracy.
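The abstract does not spell out how the closest depth is taken from a 3D box label, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes KITTI-style labels (dimensions as height, width, length; location as the bottom-face center in camera coordinates; `rotation_y` as yaw about the camera y-axis) and assumes that "closest depth" means the smallest camera-frame z among the eight box corners. The function name `closest_depth` is hypothetical.

```python
import math


def closest_depth(dims, location, rotation_y):
    """Return the smallest camera-frame z among the 8 corners of a 3D box.

    Assumptions (not confirmed by the abstract):
      dims       -- (h, w, l) in metres, KITTI order
      location   -- (x, y, z) of the bottom-face centre, camera coordinates
      rotation_y -- yaw about the camera y-axis, in radians
    """
    _, w, l = dims
    _, _, z0 = location
    sin_r, cos_r = math.sin(rotation_y), math.cos(rotation_y)

    depths = []
    for dx in (l / 2, -l / 2):        # offsets along the box length
        for dz in (w / 2, -w / 2):    # offsets along the box width
            # z-row of a rotation about the y-axis: z' = -sin(r)*dx + cos(r)*dz.
            # Box height changes only y, so top and bottom corners share z.
            depths.append(z0 - sin_r * dx + cos_r * dz)
    return min(depths)


if __name__ == "__main__":
    # Hypothetical car-sized box 20 m ahead of the camera.
    print(closest_depth((1.5, 1.6, 4.0), (0.0, 1.7, 20.0), 0.0))
```

A value computed this way could serve as the regression target for the extra depth channel; if the closest depth is instead meant as a Euclidean distance to the nearest corner, the inner expression would need to change accordingly.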
Pages: 10