Monocular depth estimation based on deep learning: An overview

Cited by: 179
Authors
Zhao, ChaoQiang [1]
Sun, QiYu [1]
Zhang, ChongZhen [1]
Tang, Yang [1]
Qian, Feng [1]
Affiliation
[1] East China Univ Sci & Technol, Key Lab Adv Control & Optimizat Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
autonomous systems; monocular depth estimation; deep learning; unsupervised learning; RECONSTRUCTION;
DOI
10.1007/s11431-020-1582-8
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Depth information is important for autonomous systems to perceive their environments and estimate their own state. Traditional depth estimation methods, such as structure from motion and stereo vision matching, rely on feature correspondences across multiple viewpoints, and the depth maps they predict are sparse. Inferring depth information from a single image (monocular depth estimation) is an ill-posed problem. With the rapid development of deep neural networks, monocular depth estimation based on deep learning has recently been widely studied and has achieved promising accuracy. In addition, deep neural networks estimate dense depth maps from single images in an end-to-end manner. To improve the accuracy of depth estimation, various network frameworks, loss functions, and training strategies have subsequently been proposed. Therefore, this review surveys current monocular depth estimation methods based on deep learning. First, we summarize several widely used datasets and evaluation indicators in deep learning-based depth estimation. Next, we review representative existing methods according to their training manner: supervised, unsupervised, and semi-supervised. Finally, we discuss the challenges and provide some ideas for future research in monocular depth estimation.
Pages: 1612-1627
Page count: 16