Learning Single-Image Depth from Videos using Quality Assessment Networks

Cited by: 9
Authors
Chen, Weifeng [1, 2]
Qian, Shengyi [1]
Deng, Jia [2]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Princeton Univ, Princeton, NJ 08544 USA
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Funding
National Science Foundation (USA);
Keywords
VISION;
DOI
10.1109/CVPR.2019.00575
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Depth estimation from a single image in the wild remains a challenging problem. One main obstacle is the lack of high-quality training data for images in the wild. In this paper we propose a method to automatically generate such data through Structure-from-Motion (SfM) on Internet videos. The core of this method is a Quality Assessment Network that identifies high-quality reconstructions obtained from SfM. Using this method, we collect single-view depth training data from a large number of YouTube videos and construct a new dataset called YouTube3D. Experiments show that YouTube3D is useful in training depth estimation networks and advances the state of the art of single-view depth estimation in the wild.
Pages: 5587-5596
Number of pages: 10