Photon-Efficient 3D Reconstruction with a Coarse-to-Fine Neural Network

Cited by: 2
Authors
Guo, Shangwei [1 ]
Lai, Zhengchao [1 ]
Li, Jun [1 ]
Han, Shaokun [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Opt & Photon, Beijing Key Lab Precis Optoelect Measurement Inst, Beijing 100081, Peoples R China
Keywords
LiDAR; Single-Photon LiDAR; Photon-Efficient 3D Reconstruction; Neural Network;
DOI
10.1016/j.optlaseng.2022.107224
Chinese Library Classification
O43 [Optics];
Discipline Classification Codes
070207 ; 0803 ;
Abstract
3D reconstruction from the sparse and noisy photon-efficient measurement Time-of-Arrival (ToA) cube is challenging because the effective echo signal occupies only a small part of the time channel, while the rest of the time channel contains only noise. Existing learning-based photon-efficient 3D reconstruction methods extract features over the entire time channel of the ToA cube. However, extracting features from the entire time channel causes the learned features to be dominated by the noise that occupies most of the time channel, thereby reducing 3D reconstruction accuracy. In this paper, we propose a Coarse-to-Fine neural network, in which the coarse part eliminates the invalid noisy time bins and the fine part extracts features on the remaining time bins, which contain only the effective echo signal. Specifically, to locate the interval to which the effective echo signal belongs, non-local spatial-temporal features of the ToA cube must be captured. To this end, we propose a transformer-based Coarse-Interval-Localization-Network (CILN), whose global receptive field aggregates features from long-distance time bins. The located interval containing only the effective echo signal is then cropped from the ToA cube and fed to the proposed Fine-Maximum-Localization-Network (FMLN) to locate the maximum of the echo signal. Because the cropping operation alters the distribution of the original signal, we propose a position encoding module that conveys this distribution-change information to the high-dimensional feature space of the FMLN. Furthermore, we propose a temporal attention module that guides the FMLN to pay more attention to the useful signal.
Compared with methods that extract features over the entire time channel, the coarse-to-fine configuration of our method eliminates the time bins containing only noise in the coarse part, reducing the influence of noise on the feature extraction of the fine part and thereby improving reconstruction accuracy. We conduct multiple experiments on simulated and real-world data, and the results show that the proposed Coarse-to-Fine neural network achieves state-of-the-art performance.
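The coarse-to-fine idea in the abstract can be illustrated with a minimal non-neural sketch: a coarse stage that picks the time interval with the largest photon count (a stand-in for the CILN), a crop, and a fine stage that locates the maximum inside the cropped interval (a stand-in for the FMLN). The function name, window length, and synthetic data below are hypothetical illustrations, not the paper's actual networks.

```python
import numpy as np

def coarse_to_fine_depth(toa_cube, interval_len=16):
    """Toy coarse-to-fine depth estimation on a ToA cube of shape (H, W, T).

    Coarse step: for each pixel, slide a window along the time axis and
    pick the interval with the largest total photon count.
    Fine step: within the cropped interval, take the bin with the maximum
    count. Returns per-pixel time-bin indices (depth proxies).
    """
    H, W, T = toa_cube.shape
    # A cumulative sum along time lets us score every window in O(T).
    csum = np.concatenate(
        [np.zeros((H, W, 1)), np.cumsum(toa_cube, axis=2)], axis=2)
    window_sums = csum[:, :, interval_len:] - csum[:, :, :-interval_len]
    start = np.argmax(window_sums, axis=2)          # coarse interval start
    depth = np.empty((H, W), dtype=np.int64)
    for i in range(H):
        for j in range(W):
            s = start[i, j]
            crop = toa_cube[i, j, s:s + interval_len]  # cropped interval
            depth[i, j] = s + np.argmax(crop)          # fine maximum
    return depth

# Synthetic example: 4x4 pixels, 128 time bins, a strong echo at bin 40
# superimposed on Poisson background noise.
rng = np.random.default_rng(0)
cube = rng.poisson(0.2, size=(4, 4, 128)).astype(float)
cube[:, :, 40] += 20.0  # effective echo peak
print(coarse_to_fine_depth(cube))  # every entry recovers bin 40
```

Restricting the fine argmax to the coarse interval is what keeps the noise-only bins from influencing the estimate, mirroring how the cropping step shields the FMLN from noise-dominated features.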
Pages: 12