Vehicle 3D Space Detection Method Based on Monocular Vision

Authors
Gu D.-Y. [1]
Zhang S. [1]
Meng F.-W. [1]
Affiliations
[1] School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao
Keywords
Computer vision; Deep learning; Feature fusion; KITTI; Monocular vehicle detection
DOI
10.12068/j.issn.1005-3026.2022.03.004
Abstract
To address the low precision of 3D bounding boxes in monocular vehicle detection, a new network based on improved FPN (feature pyramid network) feature fusion, ResNet residual units, and fully connected layers is proposed. In the training phase, the three-dimensional size of each vehicle, the residual angle, and the confidence are regressed. In the inference phase, the three-dimensional size and local angle (α) of each vehicle are predicted. The 3D bounding boxes of vehicles are then reconstructed and drawn from the center coordinates, the three-dimensional size, the yaw angle (θ), and the camera intrinsic matrix. The proposed method is evaluated on the KITTI validation set. Compared with the original method, it improves the average precision of vehicle 3D bounding boxes (AP3D) by 0.60%, 1.37%, and 1.41% at the easy, moderate, and hard difficulty levels, respectively. © 2022, Editorial Department of Journal of Northeastern University. All rights reserved.
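The reconstruction step named in the abstract can be illustrated with a short sketch. This is not the authors' implementation. It assumes KITTI-style camera coordinates (x right, y down, z forward), that the given center is the geometric center of the box, and that the yaw angle follows the commonly used relation θ = α + arctan2(x, z). The function names and the intrinsic matrix values are hypothetical and chosen only for illustration.

```python
import numpy as np


def yaw_from_local_angle(alpha, x, z):
    """Assumed relation: theta = alpha + arctan2(x, z), wrapped to [-pi, pi)."""
    theta = alpha + np.arctan2(x, z)
    return (theta + np.pi) % (2.0 * np.pi) - np.pi


def box_corners_3d(center, dims, theta):
    """Return the 8 box corners in camera coordinates as a (3, 8) array.

    center: (x, y, z) of the box center (assumed geometric center).
    dims:   (h, w, l) height, width, length in metres.
    theta:  yaw angle about the camera Y axis, in radians.
    """
    h, w, l = dims
    # Corners in the object frame: x along length, y along height, z along width.
    xs = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    ys = np.array([h, h, h, h, -h, -h, -h, -h]) / 2.0
    zs = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    corners = np.stack([xs, ys, zs])
    # Rotation about the Y axis by theta.
    c, s = np.cos(theta), np.sin(theta)
    rot_y = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return rot_y @ corners + np.asarray(center, dtype=float).reshape(3, 1)


def project_to_image(K, points):
    """Project (3, N) camera-frame points to (N, 2) pixel coordinates."""
    uvw = K @ points
    return (uvw[:2] / uvw[2]).T


# Hypothetical intrinsics and detection, for illustration only.
K = np.array([[700.0, 0.0, 600.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
center = (0.0, 0.0, 10.0)
dims = (1.5, 1.6, 4.0)
theta = yaw_from_local_angle(alpha=0.0, x=center[0], z=center[2])
corners = box_corners_3d(center, dims, theta)
pixels = project_to_image(K, corners)  # 8 image points to connect into a wireframe
```

Connecting the eight projected points along the box edges gives the drawn 3D bounding box. A dataset that stores the bottom-face center instead of the geometric center would need the y offsets shifted accordingly.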
Pages: 328-334
Number of pages: 6