LocalBins: Improving Depth Estimation by Learning Local Distributions

Cited by: 59
Authors
Bhat, Shariq Farooq [1]
Alhashim, Ibraheem [2]
Wonka, Peter [1]
Affiliations
[1] KAUST, Thuwal, Saudi Arabia
[2] Saudi Data & Artificial Intelligence Authority SD, Natl Ctr Artificial Intelligence NCAI, Riyadh, Saudi Arabia
Source
COMPUTER VISION - ECCV 2022, PT I | 2022 / Vol. 13661
Keywords
Single image depth estimation; Encoder-decoder architecture; Deep learning; Dense regression; Histogram prediction;
DOI
10.1007/978-3-031-19769-7_28
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel architecture for depth estimation from a single image. The architecture itself is based on the popular encoder-decoder architecture that is frequently used as a starting point for all dense regression tasks. We build on AdaBins which estimates a global distribution of depth values for the input image and evolve the architecture in two ways. First, instead of predicting global depth distributions, we predict depth distributions of local neighborhoods at every pixel. Second, instead of predicting depth distributions only towards the end of the decoder, we involve all layers of the decoder. We call this new architecture LocalBins. Our results demonstrate a clear improvement over the state-of-the-art in all metrics on the NYU-Depth V2 dataset. Code and pretrained models will be made publicly available (https://github.com/shariqfarooq123/LocalBins).
Pages: 480-496
Page count: 17
Related papers
40 in total
[1]   Self-Supervised Learning of Domain Invariant Features for Depth Estimation [J].
Akada, Hiroyasu ;
Bhat, Shariq Farooq ;
Alhashim, Ibraheem ;
Wonka, Peter .
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, :997-1007
[2]   Alhashim I, 2019, Arxiv, DOI arXiv:1812.11941
[3]   Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer [J].
Atapour-Abarghouei, Amir ;
Breckon, Toby P. .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :2800-2810
[4]   AdaBins: Depth Estimation Using Adaptive Bins [J].
Bhat, Shariq Farooq ;
Alhashim, Ibraheem ;
Wonka, Peter .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :4008-4017
[5]   Unsupervised monocular depth and ego-motion learning with structure and semantics [J].
Casser, Vincent ;
Pirk, Soeren ;
Mahjourian, Reza ;
Angelova, Anelia .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, :381-388
[6]   Chen XT, 2019, PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P694
[7]   CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [J].
Chen, Yun-Chun ;
Lin, Yen-Yu ;
Yang, Ming-Hsuan ;
Huang, Jia-Bin .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :1791-1800
[8]   Eigen D, 2014, ADV NEUR IN, V27
[9]   Deep Ordinal Regression Network for Monocular Depth Estimation [J].
Fu, Huan ;
Gong, Mingming ;
Wang, Chaohui ;
Batmanghelich, Kayhan ;
Tao, Dacheng .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :2002-2011
[10]   Godard C, 2019, Arxiv, DOI arXiv:1806.01260, DOI 10.48550/ARXIV.1806.01260