Visual Localization via Few-Shot Scene Region Classification

Cited by: 16
Authors
Dong, Siyan [1 ,3 ]
Wang, Shuzhe [2 ,3 ]
Zhuang, Yixin [4 ]
Kannala, Juho [2 ]
Pollefeys, Marc [3 ,5 ]
Chen, Baoquan [6 ]
Affiliations
[1] Shandong Univ, Jinan, Shandong, Peoples R China
[2] Aalto Univ, Espoo, Finland
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] Fuzhou Univ, Fuzhou, Peoples R China
[5] Microsoft, Zurich, Switzerland
[6] Peking Univ, Beijing, Peoples R China
Source
2022 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV | 2022
Funding
Academy of Finland;
DOI
10.1109/3DV57658.2022.00051
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual (re)localization addresses the problem of estimating the 6-DoF (degrees of freedom) camera pose of a query image captured in a known scene, and is a key building block of many computer vision and robotics applications. Recent advances in structure-based localization solve this problem by training neural networks to memorize the mapping from image pixels to scene coordinates, which yields the 2D-3D correspondences used for camera pose optimization. However, such memorization requires training on large numbers of posed images in each scene, which is costly and inefficient. In contrast, a few images are usually sufficient to cover the main regions of a scene for a human operator to perform visual localization. In this paper, we propose a scene region classification approach to achieve fast and effective scene memorization from few-shot images. Our insight is to leverage a) a pre-learned feature extractor, b) a scene region classifier, and c) a meta-learning strategy to accelerate training while mitigating overfitting. We evaluate our method on both indoor and outdoor benchmarks. The experiments validate the effectiveness of our method in the few-shot setting, and the training time is reduced to only a few minutes.
Pages: 393-402
Page count: 10