Deep Optics for Monocular Depth Estimation and 3D Object Detection

Cited by: 155
Authors
Chang, Julie [1]
Wetzstein, Gordon [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Keywords
SINGLE IMAGE; LAYOUT;
DOI
10.1109/ICCV.2019.01029
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Depth estimation and 3D object detection are critical for scene understanding but remain challenging to perform with a single image due to the loss of 3D information during image capture. Recent models using deep neural networks have improved monocular depth estimation performance, but there is still difficulty in predicting absolute depth and generalizing outside a standard dataset. Here we introduce the paradigm of deep optics, i.e. end-to-end design of optics and image processing, to the monocular depth estimation problem, using coded defocus blur as an additional depth cue to be decoded by a neural network. We evaluate several optical coding strategies along with an end-to-end optimization scheme for depth estimation on three datasets, including NYU Depth v2 and KITTI. We find an optimized freeform lens design yields the best results, but chromatic aberration from a singlet lens offers significantly improved performance as well. We build a physical prototype and validate that chromatic aberrations improve depth estimation on real-world results. In addition, we train object detection networks on the KITTI dataset and show that the lens optimized for depth estimation also results in improved 3D object detection performance.
Pages: 10192-10201
Number of pages: 10