Visual Localization via Few-Shot Scene Region Classification

Cited by: 16

Authors:

Dong, Siyan [1,3]

Wang, Shuzhe [2,3]

Zhuang, Yixin [4]

Kannala, Juho [2]

Pollefeys, Marc [3,5]

Chen, Baoquan [6]

Affiliations:

[1] Shandong Univ, Jinan, Shandong, Peoples R China

[2] Aalto Univ, Espoo, Finland

[3] Swiss Fed Inst Technol, Zurich, Switzerland

[4] Fuzhou Univ, Fuzhou, Peoples R China

[5] Microsoft, Zurich, Switzerland

[6] Peking Univ, Beijing, Peoples R China

Source:

2022 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV | 2022

Funding:

Academy of Finland;

Keywords:

DOI:

10.1109/3DV57658.2022.00051

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual (re)localization addresses the problem of estimating the 6-DoF (degrees of freedom) camera pose of a query image captured in a known scene, which is a key building block of many computer vision and robotics applications. Recent advances in structure-based localization solve this problem by using neural networks to memorize the mapping from image pixels to scene coordinates, which yields 2D-3D correspondences for camera pose optimization. However, such memorization requires training on a large number of posed images in each scene, which is costly and inefficient. In contrast, a few images are usually sufficient to cover the main regions of a scene for a human operator to perform visual localization. In this paper, we propose a scene region classification approach to achieve fast and effective scene memorization from few-shot images. Our insight is to leverage a) a pre-learned feature extractor, b) a scene region classifier, and c) a meta-learning strategy to accelerate training while mitigating overfitting. We evaluate our method on both indoor and outdoor benchmarks. The experiments validate the effectiveness of our method in the few-shot setting, and the training time is reduced to only a few minutes.
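
Illustrative sketch (not the authors' implementation): the abstract describes turning per-pixel predictions into 2D-3D correspondences that are then used for camera pose optimization. The Python below shows one simplified way that idea could look. It assumes a pinhole camera, assumes each predicted region label maps to a single 3D region center, and scores candidate poses by counting reprojection inliers. All function names, the region-center lookup, and the numbers are hypothetical and chosen only for illustration.

```python
import math

def project(point_world, rotation, translation, focal, center):
    """Project a 3D point with a pinhole camera, x_cam = R @ x_world + t."""
    x_cam = [
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if x_cam[2] <= 0:
        return None  # point lies behind the camera
    u = focal * x_cam[0] / x_cam[2] + center[0]
    v = focal * x_cam[1] / x_cam[2] + center[1]
    return (u, v)

def correspondences_from_regions(pixels, region_labels, region_centers):
    """Assumption: pair each pixel with the 3D center of its predicted region."""
    return [(p, region_centers[label]) for p, label in zip(pixels, region_labels)]

def count_inliers(correspondences, rotation, translation, focal, center,
                  threshold=2.0):
    """Count correspondences with reprojection error below threshold (pixels)."""
    inliers = 0
    for pixel, point_world in correspondences:
        projected = project(point_world, rotation, translation, focal, center)
        if projected is None:
            continue
        error = math.hypot(projected[0] - pixel[0], projected[1] - pixel[1])
        if error < threshold:
            inliers += 1
    return inliers

# Hypothetical toy data: two scene regions and three labeled pixels,
# the last of which is a deliberately wrong prediction (an outlier).
region_centers = {0: (0.0, 0.0, 2.0), 1: (1.0, 0.0, 2.0)}
pixels = [(320.0, 240.0), (570.0, 240.0), (100.0, 100.0)]
labels = [0, 1, 0]
matches = correspondences_from_regions(pixels, labels, region_centers)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
candidates = [
    (identity, (0.0, 0.0, 0.0)),  # candidate pose A
    (identity, (0.5, 0.0, 0.0)),  # candidate pose B, shifted along x
]
scores = [
    count_inliers(matches, R, t, focal=500.0, center=(320.0, 240.0))
    for R, t in candidates
]
best_pose = candidates[scores.index(max(scores))]
```

In a real pipeline the candidate poses would come from a minimal solver inside a RANSAC loop, followed by refinement; the hand-written candidates here only stand in for that step.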


Pages: 393-402

Page count: 10
