Learning Single-Image Depth from Videos using Quality Assessment Networks

Cited by: 9

Authors:

Chen, Weifeng [1,2]

Qian, Shengyi [1]

Deng, Jia [2]

Affiliations:

[1] Univ Michigan, Ann Arbor, MI 48109 USA

[2] Princeton Univ, Princeton, NJ 08544 USA

Source:

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019

Funding:

U.S. National Science Foundation;

Keywords:

VISION;

DOI:

10.1109/CVPR.2019.00575

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Depth estimation from a single image in the wild remains a challenging problem. One main obstacle is the lack of high-quality training data for images in the wild. In this paper we propose a method to automatically generate such data through Structure-from-Motion (SfM) on Internet videos. The core of this method is a Quality Assessment Network that identifies high-quality reconstructions obtained from SfM. Using this method, we collect single-view depth training data from a large number of YouTube videos and construct a new dataset called YouTube3D. Experiments show that YouTube3D is useful in training depth estimation networks and advances the state of the art of single-view depth estimation in the wild.

Citation

Pages: 5587-5596

Page count: 10
