ENG: End-to-End Neural Geometry for Robust Depth and Pose Estimation Using CNNs

Cited by: 6
Authors
Dharmasiri, Thanuja [1]
Spek, Andrew [1]
Drummond, Tom [1]
Affiliations
[1] Monash Univ, Melbourne, Vic, Australia
Source
COMPUTER VISION - ACCV 2018, PT I | 2019 / Vol. 11361
Funding
Australian Research Council;
Keywords
Depth; Optical flow; Pose prediction; Indoor and outdoor datasets;
DOI
10.1007/978-3-030-20887-5_39
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recovering structure and motion parameters given an image pair or a sequence of images is a well-studied problem in computer vision. This is often achieved by employing Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM) algorithms, depending on the real-time requirements. Recently, with the advent of Convolutional Neural Networks (CNNs), researchers have explored the possibility of using machine learning techniques to reconstruct the 3D structure of a scene and jointly predict the camera pose. In this work, we present a framework that achieves state-of-the-art performance on single-image depth prediction for both indoor and outdoor scenes. The depth prediction system is then extended to predict optical flow and ultimately the camera pose, and is trained end-to-end. Our framework outperforms previous deep-learning-based motion prediction approaches, and we also demonstrate that the state-of-the-art metric depths can be further improved using knowledge of the pose.
Pages: 625-642
Page count: 18