MixVPR: Feature Mixing for Visual Place Recognition

Cited by: 109
Authors
Ali-bey, Amar [1]
Chaib-draa, Brahim [1]
Giguere, Philippe [1]
Affiliations
[1] Univ Laval, Quebec City, PQ, Canada
Source
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2023
Keywords
MODEL;
DOI
10.1109/WACV56688.2023.00301
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Visual Place Recognition (VPR) is a crucial part of mobile robotics and autonomous driving as well as other computer vision tasks. It refers to the process of identifying a place depicted in a query image using only computer vision. At large scale, repetitive structures, weather and illumination changes pose a real challenge, as appearances can drastically change over time. Along with tackling these challenges, an efficient VPR technique must also be practical in real-world scenarios where latency matters. To address this, we introduce MixVPR, a new holistic feature aggregation technique that takes feature maps from pre-trained backbones as a set of global features. Then, it incorporates a global relationship between elements in each feature map in a cascade of feature mixing, eliminating the need for local or pyramidal aggregation as done in NetVLAD or TransVPR. We demonstrate the effectiveness of our technique through extensive experiments on multiple large-scale benchmarks. Our method outperforms all existing techniques by a large margin while having less than half the number of parameters compared to CosPlace and NetVLAD. We achieve a new all-time high recall@1 score of 94.6% on Pitts250k-test, 88.0% on MapillarySLS, and more importantly, 58.4% on Nordland. Finally, our method outperforms two-stage retrieval techniques such as Patch-NetVLAD, TransVPR and SuperGLUE all while being orders of magnitude faster.
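The recall@1 figures quoted in the abstract follow the usual retrieval convention: a query counts as correct when at least one of its top-K retrieved reference images shows the same place. The sketch below illustrates that metric only, not the MixVPR aggregation itself. The function names, the ground-truth format (a set of matching reference indices per query) and the toy 2-D descriptors are assumptions made for illustration and do not come from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def recall_at_k(query_descs, ref_descs, ground_truth, k=1):
    """Fraction of queries with at least one correct reference in the top k.

    ground_truth[i] is the set of reference indices that depict the same
    place as query i (a hypothetical format chosen for this sketch).
    """
    refs = [l2_normalize(r) for r in ref_descs]
    hits = 0
    for q_idx, q in enumerate(query_descs):
        q = l2_normalize(q)
        # Cosine similarity between the query and every reference descriptor.
        sims = [sum(a * b for a, b in zip(q, r)) for r in refs]
        # Indices of the k most similar references.
        top_k = sorted(range(len(refs)), key=lambda i: sims[i], reverse=True)[:k]
        if any(i in ground_truth[q_idx] for i in top_k):
            hits += 1
    return hits / len(query_descs)


# Toy data: three reference places and three queries (made-up descriptors).
references = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
queries = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]]
# The third query is labelled as place 0 but lies closer to reference 1,
# so it is a miss at k=1 and a hit at k=2.
truth = [{0}, {1}, {0}]
```

On real benchmarks the brute-force similarity loop would typically be replaced by a nearest-neighbour index, but the counting logic stays the same.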
Pages: 2997-3006
Number of pages: 10
Related papers
52 in total
[1]   GSV-CITIES: Toward appropriate supervised visual place recognition [J].
Ali-bey, Amar ;
Chaib-draa, Brahim ;
Giguere, Philippe .
NEUROCOMPUTING, 2022, 513 :194-203
[2]   Arandjelovic R, 2018, IEEE T PATTERN ANAL, V40, P1437, DOI [10.1109/CVPR.2016.572, 10.1109/TPAMI.2017.2711011]
[3]   All about VLAD [J].
Arandjelovic, Relja ;
Zisserman, Andrew .
2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, :1578-1585
[4]   Speeded-Up Robust Features (SURF) [J].
Bay, Herbert ;
Ess, Andreas ;
Tuytelaars, Tinne ;
Van Gool, Luc .
COMPUTER VISION AND IMAGE UNDERSTANDING, 2008, 110 (03) :346-359
[5]   Rethinking Visual Geo-localization for Large-Scale Applications [J].
Berton, Gabriele ;
Masone, Carlo ;
Caputo, Barbara .
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, :4868-4878
[6]   Boateng P, 2017, MEGAPROJECT RISK ANALYSIS AND SIMULATION: A DYNAMIC SYSTEMS APPROACH, P223
[7]   Unifying Deep Local and Global Features for Image Search [J].
Cao, Bingyi ;
Araujo, Andre ;
Sim, Jack .
COMPUTER VISION - ECCV 2020, PT XX, 2020, 12365 :726-743
[8]   Chen Wei, 2021, ARXIV210111282
[9]   Chen ZT, 2017, IEEE INT C INT ROBOT, P9, DOI 10.1109/IROS.2017.8202131
[10]   Learning Context Flexible Attention Model for Long-Term Visual Place Recognition [J].
Chen, Zetao ;
Liu, Lingqiao ;
Sa, Inkyu ;
Ge, Zongyuan ;
Chli, Margarita .
IEEE ROBOTICS AND AUTOMATION LETTERS, 2018, 3 (04) :4015-4022