Enforcing geometric constraints of virtual normal for depth prediction

Cited by: 342
Authors
Yin, Wei [1]
Liu, Yifan [1]
Shen, Chunhua [1]
Yan, Youliang [2]
Affiliations
[1] University of Adelaide, Adelaide, SA, Australia
[2] Huawei Technologies, Noah's Ark Lab, Shenzhen, People's Republic of China
Source
2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019) | 2019
DOI
10.1109/ICCV.2019.00578
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Monocular depth prediction plays a crucial role in understanding 3D scene geometry. Although recent methods have achieved impressive progress on evaluation metrics such as pixel-wise relative error, most neglect geometric constraints in 3D space. In this work, we show the importance of high-order 3D geometric constraints for depth prediction. By designing a loss term that enforces one simple type of geometric constraint, namely virtual normal directions determined by three randomly sampled points in the reconstructed 3D space, we can considerably improve depth prediction accuracy. Significantly, a byproduct of the predicted depth being sufficiently accurate is that we can recover good 3D structures of the scene, such as the point cloud and surface normals, directly from the depth, eliminating the need to train new sub-models as was previously done. Experiments on two benchmarks, NYU Depth-V2 and KITTI, demonstrate the effectiveness of our method and its state-of-the-art performance. Code is available at: https://tinyurl.com/virtualnormal
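The virtual normal idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the pinhole intrinsics tuple `(fx, fy, cx, cy)`, the L1 distance between unit normals, and the simple collinearity threshold are all assumptions chosen for illustration, and any additional sampling restrictions used in the paper are omitted. Depth maps are lifted to 3D points, index triples are sampled at random, and the plane normal of each triple in the predicted point cloud is compared with the normal of the same triple in the ground truth.

```python
import math
import random


def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (list of rows) to 3D points with a pinhole camera model."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def unit_normal(a, b, c, eps=1e-8):
    """Unit normal of the plane through a, b, c; None if the points are nearly collinear."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    n = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    norm = math.sqrt(sum(x * x for x in n))
    if norm < eps:
        return None
    return tuple(x / norm for x in n)


def virtual_normal_loss(pred_depth, gt_depth, intrinsics, num_triples=100, seed=0):
    """Mean L1 distance between predicted and ground-truth virtual normals.

    Hypothetical helper: the L1 distance and the sampling scheme are assumptions.
    """
    fx, fy, cx, cy = intrinsics
    pred_pts = backproject(pred_depth, fx, fy, cx, cy)
    gt_pts = backproject(gt_depth, fx, fy, cx, cy)
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(num_triples):
        # The same three pixel indices are used in both point clouds.
        i, j, k = rng.sample(range(len(gt_pts)), 3)
        n_gt = unit_normal(gt_pts[i], gt_pts[j], gt_pts[k])
        n_pred = unit_normal(pred_pts[i], pred_pts[j], pred_pts[k])
        if n_gt is None or n_pred is None:
            continue  # skip degenerate (collinear) triples
        total += sum(abs(p - g) for p, g in zip(n_pred, n_gt))
        count += 1
    return total / count if count else 0.0
```

Because each triple can span distant pixels, the comparison captures long-range geometric structure rather than only per-pixel depth error; in a training setup the same computation would be written with a differentiable tensor library.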
Pages: 5683-5692
Page count: 10