AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

Cited by: 67
Authors
Liu, Zongdai [1]
Zhou, Dingfu [1]
Lu, Feixiang [1]
Fang, Jin [1]
Zhang, Liangjun [1]
Affiliations
[1] Baidu Res, Natl Engn Lab Deep Learning Technol & Applicat, Robot & Autonomous Driving Lab, Beijing, Peoples R China
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Keywords
ACCURATE;
DOI
10.1109/ICCV48922.2021.01535
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing deep learning-based approaches for monocular 3D object detection in autonomous driving often model the object as a rotated 3D cuboid and ignore the object's geometric shape. In this work, we propose an approach for incorporating shape-aware 2D/3D constraints into the 3D detection framework. Specifically, we first employ a deep neural network to learn distinguished 2D keypoints in the 2D image domain and to regress their corresponding 3D coordinates in the local 3D object coordinate system. The 2D/3D geometric constraints are then built from these correspondences for each object to boost detection performance. To generate the ground truth of the 2D/3D keypoints, we propose an automatic model-fitting approach that fits a deformed 3D object model to the object mask in the 2D image. The proposed framework has been verified on the public KITTI dataset, and the experimental results demonstrate that the additional geometric constraints significantly improve detection performance compared with the baseline method. More importantly, the proposed framework achieves state-of-the-art performance in real time. Data and code will be available at https://github.com/zongdai/AutoShape
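
The 2D/3D keypoint constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera with intrinsic matrix K, a KITTI-style pose made of a single yaw rotation about the camera Y axis plus a translation, and a known yaw, so that every 2D/3D keypoint pair contributes two equations that are linear in the translation. The function names, the intrinsics, and the keypoint values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rotation_y(yaw):
    """Rotation about the camera Y axis (assumed KITTI-style yaw convention)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def project_keypoints(kps_local, yaw, translation, K):
    """Project N x 3 keypoints in local object coordinates to N x 2 pixels."""
    pts_cam = kps_local @ rotation_y(yaw).T + translation
    uvw = pts_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]


def solve_translation(kps_2d, kps_local, yaw, K):
    """Least-squares object translation from 2D/3D keypoint correspondences.

    With r = R(yaw) @ X and normalized image coordinates (x, y), each pair gives
        t_x - x * t_z = x * r_z - r_x
        t_y - y * t_z = y * r_z - r_y
    """
    n = len(kps_2d)
    homogeneous = np.hstack([kps_2d, np.ones((n, 1))])
    rays = homogeneous @ np.linalg.inv(K).T   # rows are (x, y, 1)
    rotated = kps_local @ rotation_y(yaw).T   # rows are r = R @ X

    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0] = 1.0
    A[0::2, 2] = -rays[:, 0]
    b[0::2] = rays[:, 0] * rotated[:, 2] - rotated[:, 0]
    A[1::2, 1] = 1.0
    A[1::2, 2] = -rays[:, 1]
    b[1::2] = rays[:, 1] * rotated[:, 2] - rotated[:, 1]

    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t


# Hypothetical intrinsics and keypoints, for illustration only.
K = np.array([[720.0, 0.0, 610.0],
              [0.0, 720.0, 175.0],
              [0.0, 0.0, 1.0]])
kps_local = np.array([[1.9, 0.0, 0.8],
                      [1.9, 0.0, -0.8],
                      [-1.9, 0.0, 0.8],
                      [-1.9, 0.0, -0.8],
                      [0.5, -1.4, 0.6],
                      [0.5, -1.4, -0.6]])
true_yaw = 0.3
true_t = np.array([1.5, 1.2, 20.0])

kps_2d = project_keypoints(kps_local, true_yaw, true_t, K)
est_t = solve_translation(kps_2d, kps_local, true_yaw, K)
residual = project_keypoints(kps_local, true_yaw, est_t, K) - kps_2d
```

With noise-free correspondences the estimated translation matches the one used to generate the keypoints and the reprojection residual vanishes. With noisy network predictions the same system gives a least-squares estimate, which is why using many keypoints per object rather than only cuboid corners can make the constraint more robust.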
Pages: 15621-15630
Number of pages: 10