LocalBins: Improving Depth Estimation by Learning Local Distributions

Cited by: 59

Authors:

Bhat, Shariq Farooq [1]

Alhashim, Ibraheem [2]

Wonka, Peter [1]

Affiliations:

[1] KAUST, Thuwal, Saudi Arabia

[2] Saudi Data & Artificial Intelligence Authority SD, Natl Ctr Artificial Intelligence NCAI, Riyadh, Saudi Arabia

Source:

COMPUTER VISION - ECCV 2022, PT I | 2022 / Vol. 13661

Keywords:

Single image depth estimation; Encoder-decoder architecture; Deep learning; Dense regression; Histogram prediction;

DOI:

10.1007/978-3-031-19769-7_28

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose a novel architecture for depth estimation from a single image. The architecture itself is based on the popular encoder-decoder architecture that is frequently used as a starting point for all dense regression tasks. We build on AdaBins which estimates a global distribution of depth values for the input image and evolve the architecture in two ways. First, instead of predicting global depth distributions, we predict depth distributions of local neighborhoods at every pixel. Second, instead of predicting depth distributions only towards the end of the decoder, we involve all layers of the decoder. We call this new architecture LocalBins. Our results demonstrate a clear improvement over the state-of-the-art in all metrics on the NYU-Depth V2 dataset. Code and pretrained models will be made publicly available (https://github.com/shariqfarooq123/LocalBins).
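The contrast the abstract draws between a global depth distribution and per-pixel local distributions can be illustrated with a small sketch. This is not the authors' implementation: it assumes the common bin-based formulation in which depth is the probability-weighted sum of bin centers, and every name here (`bin_centers`, `expected_depth`, `global_bins_depth`, `local_bins_depth`) and the default depth range are hypothetical choices for illustration only.

```python
import math

def bin_centers(widths, d_min, d_max):
    """Convert relative bin widths into bin-center depths over [d_min, d_max]."""
    total = sum(widths)
    edges = [d_min]
    for w in widths:
        # Each bin takes a share of the depth range proportional to its width.
        edges.append(edges[-1] + (d_max - d_min) * w / total)
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected_depth(widths, logits, d_min=0.001, d_max=10.0):
    """Assumed formulation: depth = sum_k p_k * c_k for one pixel."""
    centers = bin_centers(widths, d_min, d_max)
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, centers))

def global_bins_depth(widths, logits_map, d_min=0.001, d_max=10.0):
    """Global variant: one set of bin widths shared by every pixel."""
    return [[expected_depth(widths, logits, d_min, d_max) for logits in row]
            for row in logits_map]

def local_bins_depth(widths_map, logits_map, d_min=0.001, d_max=10.0):
    """Local variant: each pixel carries its own bin widths."""
    return [[expected_depth(w, l, d_min, d_max) for w, l in zip(w_row, l_row)]
            for w_row, l_row in zip(widths_map, logits_map)]
```

In the global variant every pixel shares one set of bins, so only the per-pixel probabilities differ. In the local variant the bins themselves vary from pixel to pixel, which is the shift the abstract describes at a high level.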


Pages: 480-496

Page count: 17

