Collaborative Learning of Depth Estimation, Visual Odometry and Camera Relocalization from Monocular Videos

Cited by: 0

Authors:

Zhao, Haimei [1, 3]

Bian, Wei [2]

Yuan, Bo [1]

Tao, Dacheng [3]

Affiliations:

[1] Tsinghua Univ, Shenzhen Int Grad Sch, Beijing, Peoples R China

[2] Univ Technol Sydney, Ctr Artificial Intelligence, Sydney, NSW, Australia

[3] Univ Sydney, Fac Engn, Sch CS, UBTECH Sydney AI Ctr, Sydney, NSW, Australia

Source:

PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2020

Funding:

Australian Research Council

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Scene perception and understanding tasks, including depth estimation, visual odometry (VO) and camera relocalization, are fundamental to applications such as autonomous driving, robotics and drones. Driven by the power of deep learning, significant progress has been achieved on the individual tasks, but the rich correlations among the three are largely neglected. In previous studies, VO is generally accurate locally yet suffers from drift over long distances. By contrast, camera relocalization performs well in the global sense but lacks local precision. We argue that these two tasks should be strategically combined to leverage their complementary advantages, and be further improved by exploiting the 3D geometric information from depth data, which in turn also benefits depth estimation. Therefore, we present a collaborative learning framework consisting of DepthNet, LocalPoseNet and GlobalPoseNet, with a joint optimization loss, to estimate depth, VO and camera relocalization jointly. Moreover, the Geometric Attention Guidance Model is introduced to exploit the geometric relevance among the three branches during learning. Extensive experiments demonstrate that the joint learning scheme is useful for all tasks and that our method outperforms current state-of-the-art techniques in depth estimation and camera relocalization, with highly competitive performance in VO.

Pages: 488-494

Page count: 7

