I2D-Loc: Camera localization via image to LiDAR depth flow

Cited by: 14
Authors
Chen, Kuangyi [1 ]
Yu, Huai [1 ]
Yang, Wen [1 ]
Yu, Lei [1 ]
Scherer, Sebastian [2 ]
Xia, Gui-Song [3 ]
Affiliations
[1] Wuhan Univ, Sch Elect Informat, Wuhan 430072, Peoples R China
[2] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
[3] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Camera localization; 2D-3D registration; Flow estimation; Depth completion; Neural network; LINE; POSE;
DOI
10.1016/j.isprsjprs.2022.10.009
Chinese Library Classification
P9 [Physical Geography];
Discipline Code
0705; 070501;
Abstract
Accurate camera localization in existing LiDAR maps is promising since it potentially allows exploiting the strengths of both LiDAR-based and camera-based methods. However, effective methods that robustly handle the appearance and modality differences inherent in 2D-3D localization are still missing. To overcome these problems, we propose I2D-Loc, a scene-agnostic, end-to-end trainable neural network that estimates the 6-DoF pose from an RGB image to an existing LiDAR map by local optimization around an initial pose. Specifically, we first project the LiDAR map onto the image plane according to a rough initial pose and apply a depth completion algorithm to generate a dense depth image. We further design a confidence map to weight the features extracted from the dense depth, yielding a more reliable depth representation. We then use a neural network to estimate the correspondence flow between the depth and RGB images. Finally, we employ the BPnP algorithm to estimate the 6-DoF pose, computing the gradients of the pose error to optimize the front-end network parameters. Moreover, by decoupling the camera intrinsic parameters from the end-to-end training process, I2D-Loc generalizes to images with different intrinsics. Experiments on the KITTI, Argoverse, and Lyft5 datasets demonstrate that I2D-Loc achieves centimeter-level localization accuracy. The source code, dataset, trained models, and demo videos are released at https://levenberg.github.io/I2D-Loc/.
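The first step of the pipeline outlined in the abstract, projecting the LiDAR map into the image plane under a rough initial pose, can be sketched as below. This is a minimal illustration under assumed conventions (a standard pinhole model, points already expressed in the map frame, and a simple z-buffer for occlusion); the function name and interfaces are hypothetical, not the authors' implementation.

```python
import numpy as np

def project_lidar_to_depth(points, R, t, K, h, w):
    """Render LiDAR points (N, 3) into a sparse depth image of size (h, w)
    using a rough initial pose (R, t) and camera intrinsics K (3x3)."""
    cam = points @ R.T + t                     # map frame -> camera frame
    cam = cam[cam[:, 2] > 0]                   # keep points in front of the camera
    uvz = cam @ K.T                            # pinhole projection, still scaled by z
    uv = (uvz[:, :2] / uvz[:, 2:3]).astype(int)
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    uv, z = uv[inside], cam[inside, 2]
    depth = np.zeros((h, w))
    order = np.argsort(-z)                     # write far points first ...
    depth[uv[order, 1], uv[order, 0]] = z[order]  # ... so the nearest one wins per pixel
    return depth
```

The resulting sparse depth image is what a depth completion algorithm would then densify before flow estimation against the RGB image.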
Pages: 209-221 (13 pages)