3D Street Object Detection from Monocular Images Using Deep Learning and Depth Information

Cited by: 1
Authors
Liu, Wei [1 ,2 ,3 ]
Zhang, Tao [1 ,2 ,3 ]
Ma, Yun [1 ,2 ,3 ]
Wei, Longsheng [1 ,2 ,3 ]
Affiliations
[1] China Univ Geosci, Sch Automat, 388 Lumo Rd, Wuhan 430074, Hubei, Peoples R China
[2] Hubei Key Lab Adv Control & Intelligent Automat Co, 388 Lumo Rd, Wuhan 430074, Hubei, Peoples R China
[3] Minist Educ, Engn Res Ctr Intelligent Technol Geoexplorat, 388 Lumo Rd, Wuhan 430074, Hubei, Peoples R China
Keywords
3D detection; monocular image; deep learning; street object
DOI
10.20965/jaciii.2023.p0198
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we present a three-dimensional (3D) object detection algorithm for monocular images, built as an end-to-end network that incorporates depth information. The network consists of three parts. The first part is a basic object detection neural network that serves as the main body and uses a region proposal network to obtain two-dimensional (2D) region proposals for each object. The second part is a depth estimation branch network that obtains the depth of the object pixels and calculates the corresponding 3D point cloud. In the last part, the features obtained from the first two parts are concatenated and fed into fully connected layers, which output the 2D and 3D detection results. Compared with certain existing methods, the proposed method improves the accuracy of the detection results.
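The abstract does not say how the depth branch turns per-pixel depth into a 3D point cloud. A common way to do this, assumed here purely for illustration, is back-projection through a pinhole camera model. The sketch below applies it to the pixels inside one 2D region proposal. The function name `backproject_roi`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the box layout `(u_min, v_min, u_max, v_max)` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def backproject_roi(
    depth: Sequence[Sequence[float]],
    box: Tuple[int, int, int, int],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project depth pixels inside a 2D box to camera-frame 3D points.

    Assumes a pinhole camera: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    `box` is (u_min, v_min, u_max, v_max) in pixels, max bounds exclusive.
    """
    u_min, v_min, u_max, v_max = box
    points: List[Point3D] = []
    for v in range(v_min, v_max):
        for u in range(u_min, u_max):
            z = depth[v][u]
            if z <= 0.0:
                # Skip pixels with no valid depth estimate.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Toy 4x4 depth map, every pixel 10 m away, with made-up intrinsics.
    toy_depth = [[10.0] * 4 for _ in range(4)]
    cloud = backproject_roi(toy_depth, (1, 1, 3, 3), fx=2.0, fy=2.0, cx=2.0, cy=2.0)
    print(len(cloud), cloud[0])
```

In a real pipeline the intrinsics would come from the camera calibration of the dataset, and the loop would normally be vectorised. The plain-Python form is used here only to keep the arithmetic visible.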
Pages: 198-206
Page count: 9