Don't Forget The Past: Recurrent Depth Estimation from Monocular Video

Cited by: 91
Authors
Patil, Vaishakh [1]
Van Gansbeke, Wouter [2]
Dai, Dengxin [1]
Van Gool, Luc [1,2]
Affiliations
[1] Swiss Fed Inst Technol, TRACE Zurich, Comp Vis Lab, CH-8092 Zurich, Switzerland
[2] Katholieke Univ Leuven, Toyota TRACE Leuven, Dept Elect Engn ESAT, B-3001 Leuven, Belgium
Keywords
Deep learning for visual perception; RGBD perception; sensor fusion; novel deep learning methods; autonomous vehicle navigation; prediction
DOI
10.1109/LRA.2020.3017478
CLC number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
Autonomous cars need continuously updated depth information. Thus far, depth is mostly estimated independently for a single frame at a time, even if the method starts from video input. Our method produces a time series of depth maps, which makes it an ideal candidate for online learning approaches. In particular, we put three different types of depth estimation (supervised depth prediction, self-supervised depth prediction, and self-supervised depth completion) into a common framework. We integrate the corresponding networks with a ConvLSTM such that the spatiotemporal structure of depth across frames can be exploited to yield more accurate depth estimates. Our method is flexible. It can be applied to monocular videos only or be combined with different types of sparse depth patterns. We carefully study the architecture of the recurrent network and its training strategy. We are the first to successfully exploit recurrent networks for real-time self-supervised monocular depth estimation and completion. Extensive experiments show that our recurrent method outperforms its image-based counterpart consistently and significantly in both self-supervised scenarios. It also outperforms previous depth estimation methods of the three popular groups. Please refer to our webpage(1) for details.
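To illustrate the architectural idea summarized in the abstract (a ConvLSTM placed inside the depth network so that features from previous frames inform the current prediction), the following is a minimal PyTorch sketch. It is not the authors' implementation: the ConvLSTMCell class, the channel sizes, and the stand-in encoder features are illustrative assumptions only.

```python
# Minimal sketch (not the paper's code): a ConvLSTM cell between an encoder and a
# depth decoder, carrying a hidden state from frame to frame of a monocular video.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Single ConvLSTM cell: a convolutional analogue of an LSTM cell."""
    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2
        # One convolution produces the input, forget, output, and candidate gates.
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels, kernel_size, padding=padding)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

# Hypothetical usage over a short clip: the recurrent state accumulates
# spatiotemporal depth structure across frames.
B, C, H, W, T = 2, 64, 24, 80, 4                # batch, channels, feature size, frames
cell = ConvLSTMCell(in_channels=C, hidden_channels=C)
state = (torch.zeros(B, C, H, W), torch.zeros(B, C, H, W))
for t in range(T):
    encoder_features = torch.randn(B, C, H, W)  # stand-in for encoder output at frame t
    fused, state = cell(encoder_features, state)
    # `fused` would then be passed to the depth decoder for frame t.
```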
Pages: 6813-6820
Number of pages: 8