Accurate Deep Direct Geo-Localization from Ground Imagery and Phone-Grade GPS

Cited by: 2
Authors
Sun, Shaohui [1 ]
Sarukkai, Ramesh [1 ]
Kwok, Jack [1 ]
Shet, Vinay [1 ]
Affiliations
[1] Lyft Inc, Engn Ctr Level5, Palo Alto, CA 94304 USA
Source
PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW) | 2018
DOI
10.1109/CVPRW.2018.00148
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurately localizing vehicles in the world frame is one of the most critical problems in autonomous driving and ride-sharing technology. In addition to common multi-view camera systems, localization typically relies on industrial-grade sensors such as LiDAR, differential GPS, and high-precision IMUs. In this paper, we develop an effective alternative: we train a geo-spatial deep neural network (CNN+LSTM) to predict accurate geo-locations (latitude and longitude) using only ordinary ground imagery and low-accuracy phone-grade GPS. We evaluate our approach on the open dataset released for the ACM Multimedia 2017 Grand Challenge; with ground-truth locations available for training, we reach nearly lane-level accuracy. We also evaluate the proposed method on images we collected in the downtown San Francisco area, often described as a "downtown canyon," where consumer GPS signals are extremely inaccurate. The results show that, using only phone-grade GPS, the model predicts locations of sufficient quality for real business applications such as ride-sharing. Unlike classic visual localization or recent PoseNet-like methods, which work well indoors or in small-scale outdoor environments, we avoid using a map or a structure-from-motion (SfM) model altogether. More importantly, the proposed method can be scaled up without concern over the potential failure of 3D reconstruction.
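The abstract describes a CNN+LSTM network that fuses per-frame image features with noisy phone-grade GPS fixes and regresses refined latitude/longitude. The sketch below illustrates that data flow only; it is not the authors' implementation. The CNN backbone is replaced by a hypothetical random-projection stub, the LSTM cell is a minimal NumPy version, and all weights are untrained, so the outputs demonstrate shapes and wiring, not accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTMCell:
    """Minimal LSTM cell (Hochreiter & Schmidhuber, 1997) in NumPy."""
    def __init__(self, input_dim, hidden_dim):
        self.hidden_dim = hidden_dim
        # One stacked weight matrix for the four gates (input, forget, cell, output).
        self.W = rng.normal(0, 0.1, (4 * hidden_dim, input_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)          # new cell state
        h = o * np.tanh(c)                  # new hidden state
        return h, c

def cnn_features(image, out_dim=32):
    """Stand-in for a CNN backbone (e.g. a ResNet): here just a random
    projection of the flattened image. Hypothetical placeholder."""
    proj = rng.normal(0, 0.01, (out_dim, image.size))
    return proj @ image.ravel()

def predict_track(images, noisy_gps):
    """Fuse per-frame image features with phone-grade GPS fixes and
    regress a refined (lat, lon) per frame."""
    feat_dim, hidden = 32, 16
    cell = TinyLSTMCell(feat_dim + 2, hidden)      # image features + (lat, lon)
    head = rng.normal(0, 0.01, (2, hidden))        # hidden -> (dlat, dlon)
    h, c = np.zeros(hidden), np.zeros(hidden)
    refined = []
    for img, gps in zip(images, noisy_gps):
        x = np.concatenate([cnn_features(img, feat_dim), gps])
        h, c = cell.step(x, h, c)
        # Predict a small correction on top of the noisy GPS fix.
        refined.append(gps + head @ h)
    return np.array(refined)

# Toy sequence: 5 frames of 8x8 "images" plus noisy (lat, lon) fixes.
images = rng.normal(size=(5, 8, 8))
gps = np.array([[37.7749, -122.4194]] * 5) + rng.normal(0, 1e-4, (5, 2))
out = predict_track(images, gps)
print(out.shape)  # (5, 2): one refined (lat, lon) per frame
```

Feeding the noisy GPS fix into the recurrent input and predicting a residual correction, rather than an absolute location, keeps the untrained sketch close to the sensor prior; the paper's actual training objective and architecture details are in the full text.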
Pages: 1129-1136 (8 pages)
References
17 items in total
[1]  
[Anonymous], 2016, ARXIV161107890
[2]  
[Anonymous], IEEE C COMP VIS PATT
[3]   Learning and calibrating per-location classifiers for visual place recognition [J].
Gronat, Petr ;
Obozinski, Guillaume ;
Sivic, Josef ;
Pajdla, Tomas .
2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, :907-914
[4]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[5]  
Hochreiter S, 1997, NEURAL COMPUT, V9, P1735, DOI [10.1162/neco.1997.9.8.1735, 10.1007/978-3-642-24797-2, 10.1162/neco.1997.9.1.1]
[6]  
Kendall A, 2016, IEEE INT CONF ROBOT, P4762, DOI 10.1109/ICRA.2016.7487679
[7]   PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization [J].
Kendall, Alex ;
Grimes, Matthew ;
Cipolla, Roberto .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :2938-2946
[8]  
Li YP, 2010, LECT NOTES COMPUT SC, V6312, P791
[9]  
Lynen S, 2015, ROBOTICS: SCIENCE AND SYSTEMS XI
[10]   ORB-SLAM: A Versatile and Accurate Monocular SLAM System [J].
Mur-Artal, Raul ;
Montiel, J. M. M. ;
Tardos, Juan D. .
IEEE TRANSACTIONS ON ROBOTICS, 2015, 31 (05) :1147-1163