Unsupervised High-Resolution Depth Learning From Videos With Dual Networks

Cited by: 45

Authors:

Zhou, Junsheng [1]

Wang, Yuwang [2]

Qin, Kaihuai [1]

Zeng, Wenjun [2]

Affiliations:

[1] Tsinghua Univ, Beijing, Peoples R China

[2] Microsoft Res, Beijing, Peoples R China

Source:

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019

DOI:

10.1109/ICCV.2019.00697

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Unsupervised depth learning takes the appearance difference between a target view and a view synthesized from its adjacent frame as the supervisory signal. Since the supervisory signal comes only from the images themselves, the resolution of the training data significantly impacts performance. High-resolution images contain more fine-grained details and provide a more accurate supervisory signal. However, due to limited memory and computation power, the original images are typically down-sampled during training, which causes a heavy loss of detail and disparity accuracy. In order to fully exploit the information contained in high-resolution data, we propose a simple yet effective dual-network architecture, which can directly take high-resolution images as input and efficiently generate high-resolution, high-accuracy depth maps. We also propose a Self-assembled Attention (SA-Attention) module to handle low-texture regions. Evaluation on the benchmark KITTI and Make3D datasets demonstrates that our method achieves state-of-the-art results in the monocular depth estimation task.
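The supervisory signal described in the abstract, an appearance difference between a target view and a synthesized view, can be illustrated with a minimal sketch. The abstract does not specify the loss used in the paper, so a plain mean absolute (L1) difference is assumed here. The function name `photometric_l1`, the grayscale list-of-rows image layout, and the optional `valid_mask` argument are hypothetical choices for illustration only, and the view-synthesis (warping) step itself is omitted.

```python
def photometric_l1(target, synthesized, valid_mask=None):
    """Mean absolute intensity difference between two grayscale images.

    Assumption: a generic L1 photometric error, not the paper's actual loss.
    Images are lists of rows of floats; valid_mask (optional) has the same
    layout and marks the pixels that should contribute to the error.
    """
    if len(target) != len(synthesized):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for r, (t_row, s_row) in enumerate(zip(target, synthesized)):
        if len(t_row) != len(s_row):
            raise ValueError("rows must have the same length")
        for c, (t, s) in enumerate(zip(t_row, s_row)):
            # Skip pixels flagged as invalid (e.g. warped outside the frame).
            if valid_mask is not None and not valid_mask[r][c]:
                continue
            total += abs(t - s)
            count += 1
    # With no valid pixels there is nothing to penalize.
    return total / count if count else 0.0


# Toy 2x2 example: the synthesized view is uniformly brighter than the target.
target_view = [[0.0, 0.0], [0.0, 0.0]]
synthesized_view = [[0.5, 0.5], [0.5, 0.5]]
error = photometric_l1(target_view, synthesized_view)
```

In an actual training pipeline this error would be computed on image tensors and minimized with respect to the predicted depth and camera motion that produce the synthesized view. A smaller error indicates that the synthesized view explains the target view better.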

Pages: 6871-6880

Page count: 10

