Collaborative Learning of Depth Estimation, Visual Odometry and Camera Relocalization from Monocular Videos

Cited by: 0
Authors
Zhao, Haimei [1, 3]
Bian, Wei [2 ]
Yuan, Bo [1 ]
Tao, Dacheng [3 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] Univ Technol Sydney, Ctr Artificial Intelligence, Sydney, NSW, Australia
[3] Univ Sydney, Fac Engn, Sch CS, UBTECH Sydney AI Ctr, Sydney, NSW, Australia
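Source
PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2020
Funding
Australian Research Council;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract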
Scene perception and understanding tasks, including depth estimation, visual odometry (VO) and camera relocalization, are fundamental for applications such as autonomous driving, robots and drones. Driven by the power of deep learning, significant progress has been achieved on the individual tasks, but the rich correlations among the three tasks are largely neglected. In previous studies, VO is generally accurate in local scope yet suffers from drift over long distances. By contrast, camera relocalization performs well in the global sense but lacks local precision. We argue that these two tasks should be strategically combined to leverage their complementary advantages, and be further improved by exploiting the 3D geometric information from depth data, which in turn also benefits depth estimation. Therefore, we present a collaborative learning framework, consisting of DepthNet, LocalPoseNet and GlobalPoseNet with a joint optimization loss, to estimate depth, VO and camera localization together. Moreover, the Geometric Attention Guidance Model is introduced to exploit the geometric relevance among the three branches during learning. Extensive experiments demonstrate that the joint learning scheme is useful for all tasks and that our method outperforms current state-of-the-art techniques in depth estimation and camera relocalization, with highly competitive performance in VO.
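The contrast the abstract draws between VO (locally accurate, drifting over long distances) and camera relocalization (globally bounded error, less local precision) can be illustrated with a toy simulation. The sketch below is not the paper's method and involves no networks: it composes 2D relative motions with a small, assumed heading bias to mimic VO, and adds bounded, non-accumulating noise to the true position to mimic relocalization. All names and parameter values (`compose`, `simulate`, `heading_bias`, `reloc_noise`) are hypothetical choices made for illustration.

```python
import math
import random


def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), given in the pose's own frame,
    to a 2D pose (x, y, theta)."""
    x, y, th = pose
    dx, dy, dth = delta
    return (
        x + dx * math.cos(th) - dy * math.sin(th),
        y + dx * math.sin(th) + dy * math.cos(th),
        th + dth,
    )


def position_error(a, b):
    """Euclidean distance between the positions of two poses."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def simulate(n_steps=200, step=1.0, heading_bias=0.002, reloc_noise=0.5, seed=0):
    """Return per-frame position errors for a VO-like and a relocalization-like estimator.

    Assumptions (illustrative only): the camera moves in a straight line,
    the VO-like estimator has a constant heading bias per step, and the
    relocalization-like estimator has independent uniform position noise.
    """
    rng = random.Random(seed)
    true_pose = (0.0, 0.0, 0.0)
    vo_pose = (0.0, 0.0, 0.0)
    vo_err, reloc_err = [], []
    for _ in range(n_steps):
        # Ground truth: one straight step forward.
        true_pose = compose(true_pose, (step, 0.0, 0.0))
        # VO-like: chain relative motions, so the heading bias accumulates.
        vo_pose = compose(vo_pose, (step, 0.0, heading_bias))
        # Relocalization-like: absolute estimate per frame, errors do not accumulate.
        reloc_pose = (
            true_pose[0] + rng.uniform(-reloc_noise, reloc_noise),
            true_pose[1] + rng.uniform(-reloc_noise, reloc_noise),
            true_pose[2],
        )
        vo_err.append(position_error(vo_pose, true_pose))
        reloc_err.append(position_error(reloc_pose, true_pose))
    return vo_err, reloc_err


if __name__ == "__main__":
    vo_err, reloc_err = simulate()
    print("VO error after 10 steps: ", round(vo_err[9], 3))
    print("VO error after 200 steps:", round(vo_err[-1], 3))
    print("Max relocalization error:", round(max(reloc_err), 3))
```

In this toy setting the chained estimate starts out more accurate than the absolute one and ends up far worse, which is the complementary behaviour the abstract argues should be exploited by combining the two tasks.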
Pages: 488 - 494
Page count: 7