Unsupervised High-Resolution Depth Learning From Videos With Dual Networks

Cited by: 45
Authors
Zhou, Junsheng [1]
Wang, Yuwang [2]
Qin, Kaihuai [1]
Zeng, Wenjun [2]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
DOI
10.1109/ICCV.2019.00697
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Unsupervised depth learning uses the appearance difference between a target view and a view synthesized from an adjacent frame as the supervisory signal. Since the supervisory signal comes only from the images themselves, the resolution of the training data significantly affects performance. High-resolution images contain more fine-grained details and provide a more accurate supervisory signal. However, due to limited memory and computation power, the original images are typically down-sampled during training, which causes a heavy loss of detail and disparity accuracy. To fully exploit the information contained in high-resolution data, we propose a simple yet effective dual-network architecture that takes high-resolution images directly as input and efficiently generates high-resolution, high-accuracy depth maps. We also propose a Self-assembled Attention (SA-Attention) module to handle low-texture regions. Evaluation on the KITTI and Make3D benchmark datasets demonstrates that our method achieves state-of-the-art results in monocular depth estimation.
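The supervisory signal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it contains neither the dual networks nor the SA-Attention module, it uses nearest-neighbour sampling where a trainable system would need differentiable (e.g. bilinear) sampling, and it is written in dependency-free Python for readability rather than speed. The function names, the pinhole parameters `fx, fy, cx, cy`, and the pose `R, t` are all assumptions introduced here for illustration.

```python
def synthesize_view(src, depth, fx, fy, cx, cy, R, t):
    """Inverse-warp `src` into the target view (nearest-neighbour sampling).

    src, depth: H x W lists of rows; `depth` is the depth predicted for
    the target view. R (3x3) and t (3) are a hypothetical relative pose
    mapping target-camera coordinates into the source camera.
    Returns (synth, valid), both H x W.
    """
    h, w = len(depth), len(depth[0])
    synth = [[0.0] * w for _ in range(h)]
    valid = [[False] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            d = depth[v][u]
            # Back-project the target pixel to a 3D point (pinhole model).
            p = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Express that point in the source camera frame.
            q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            if q[2] <= 1e-6:
                continue  # point lies behind the source camera
            # Project into the source image and round to the nearest pixel.
            us = round(fx * q[0] / q[2] + cx)
            vs = round(fy * q[1] / q[2] + cy)
            if 0 <= us < w and 0 <= vs < h:
                synth[v][u] = float(src[vs][us])
                valid[v][u] = True
    return synth, valid


def photometric_loss(target, synth, valid):
    """Mean absolute intensity difference over pixels with a valid projection."""
    diffs = [
        abs(target[v][u] - synth[v][u])
        for v in range(len(target))
        for u in range(len(target[0]))
        if valid[v][u]
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Toy usage: a 4 x 5 intensity ramp, constant depth, identity pose.
src = [[float(10 * r + c) for c in range(5)] for r in range(4)]
depth = [[2.0] * 5 for _ in range(4)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
synth, valid = synthesize_view(src, depth, 10.0, 10.0, 2.0, 2.0, identity, [0.0, 0.0, 0.0])
loss = photometric_loss(src, synth, valid)
```

In this sketch a wrong depth value moves the sampled source pixel, which changes the loss; that dependence is what lets image differences alone act as supervision, and it is also why down-sampling the images coarsens the signal.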
Pages: 6871-6880
Number of pages: 10