Depth Estimation of Video Sequences With Perceptual Losses

Cited by: 20
Authors
Wang, Anjie [1 ]
Fang, Zhijun [1 ]
Gao, Yongbin [1 ]
Jiang, Xiaoyan [1 ]
Ma, Siwei [2 ]
Affiliations
[1] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Shanghai 201620, Peoples R China
[2] Peking Univ, Dept Elect Engn & Comp Sci, Beijing 100871, Peoples R China
Source
IEEE ACCESS | 2018, Vol. 6
Funding
National Natural Science Foundation of China;
Keywords
Depth estimation; perceptual losses; unsupervised; deep learning;
DOI
10.1109/ACCESS.2018.2846546
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
3-D vision plays an important role in intelligent robot perception, but it typically requires additional 3-D sensors. Depth estimation from monocular videos offers an alternative way to recover 3-D information. In this paper, we propose an unsupervised learning framework that uses a perceptual loss for depth estimation. A depth network and a pose network are first trained to estimate the depth map and the camera motion of the video sequence, respectively. Given the estimated depth and pose of the original frame, the adjacent frame can be reconstructed, and the pixel-wise differences between the reconstructed frame and the original frame serve as a per-pixel loss. In addition, high-level features extracted from the reconstructed and original views by a pre-trained network define a perceptual loss that assesses the quality of the reconstruction. We combine the respective advantages of these two losses and train the feed-forward network with both the per-pixel loss and the perceptual loss to generate the depth map. Experimental results show that our method significantly improves the accuracy of the estimated depth maps.
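To make the loss combination described in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation. It assumes the adjacent view has already been re-synthesized from the estimated depth and camera pose (`recon`), uses an ImageNet-pretrained VGG-16 as the feature extractor for the perceptual term (the paper only states that a pre-trained network is used), and uses placeholder loss weights.

```python
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen VGG-16 feature extractor (up to relu3_3) for the perceptual term.
# VGG-16 and the relu3_3 cutoff are assumptions for illustration only.
_vgg = vgg16(pretrained=True).features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad = False

def perceptual_loss(recon, target):
    """L1 distance between feature maps of the reconstructed and original views.
    Inputs are expected to be ImageNet-normalized N x 3 x H x W tensors."""
    return F.l1_loss(_vgg(recon), _vgg(target))

def combined_loss(recon, target, w_pixel=1.0, w_perceptual=0.01):
    """Per-pixel photometric loss plus feature-space (perceptual) loss.
    `recon` is the adjacent view re-synthesized from the estimated depth and
    camera pose; `target` is the original frame. The weights are placeholders."""
    pixel_term = F.l1_loss(recon, target)
    return w_pixel * pixel_term + w_perceptual * perceptual_loss(recon, target)
```

In training, `combined_loss` would be evaluated on each reconstructed/original view pair and backpropagated jointly through the depth and pose networks; the relative weighting of the two terms is a tunable hyperparameter.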
Pages: 30536-30546
Number of pages: 11