Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

Cited by: 207
Authors
Hausler, Stephen [1 ]
Garg, Sourav [1 ]
Xu, Ming [1 ]
Milford, Michael [1 ]
Fischer, Tobias [1 ]
Affiliations
[1] Queensland Univ Technol, QUT Ctr Robot, Brisbane, Qld, Australia
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
DOI
10.1109/CVPR46437.2021.01392
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual Place Recognition is a challenging task for robotics and autonomous systems, which must deal with the twin problems of appearance and viewpoint change in an ever-changing world. This paper introduces Patch-NetVLAD, which provides a novel formulation for combining the advantages of both local and global descriptor methods by deriving patch-level features from NetVLAD residuals. Unlike the fixed spatial neighborhood regime of existing local keypoint features, our method enables aggregation and matching of deep-learned local features defined over the feature-space grid. We further introduce a multi-scale fusion of patch features that have complementary scales (i.e. patch sizes) via an integral feature space and show that the fused features are highly invariant to both condition (season, structure, and illumination) and viewpoint (translation and rotation) changes. Patch-NetVLAD achieves state-of-the-art visual place recognition results in computationally limited scenarios, validated on a range of challenging real-world datasets, including winning the Facebook Mapillary Visual Place Recognition Challenge at ECCV 2020. It is also adaptable to user requirements, with a speed-optimised version operating over an order of magnitude faster than the state-of-the-art. By combining superior performance with improved computational efficiency in a configurable framework, Patch-NetVLAD is well suited to enhance both stand-alone place recognition capabilities and the overall performance of SLAM systems.
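The "integral feature space" mentioned in the abstract is analogous to an integral image (summed-area table) built over a dense feature map: once the table is computed, the aggregate feature of any patch, at any patch size, costs only four lookups. The following is a minimal NumPy sketch of that idea, not the paper's actual implementation; the feature-map sizes and patch sizes are hypothetical, standing in for NetVLAD-derived dense features.

```python
import numpy as np

def integral_feature_space(feats):
    """Summed-area table over the spatial dims of an H x W x D feature map.

    A zero row/column is prepended so that ii[i, j] holds the sum of
    feats[:i, :j], which simplifies the four-lookup patch formula below.
    """
    h, w, d = feats.shape
    ii = np.zeros((h + 1, w + 1, d))
    ii[1:, 1:, :] = np.cumsum(np.cumsum(feats, axis=0), axis=1)
    return ii

def patch_sums(ii, p):
    """Aggregate feature of every p x p patch via four table lookups each.

    Output shape is (H - p + 1, W - p + 1, D); entry [i, j] equals
    feats[i:i+p, j:j+p].sum(axis=(0, 1)) regardless of p, so multiple
    complementary patch sizes reuse the same integral table.
    """
    return ii[p:, p:, :] - ii[:-p, p:, :] - ii[p:, :-p, :] + ii[:-p, :-p, :]

# Toy dense feature map (hypothetical size) and two complementary patch sizes,
# mirroring the multi-scale fusion idea at a sketch level.
rng = np.random.default_rng(0)
feats = rng.random((8, 10, 4))
ii = integral_feature_space(feats)
multi_scale = {p: patch_sums(ii, p) for p in (2, 5)}
```

The design point is that extracting patch descriptors at several scales shares one O(HWD) precomputation, so adding extra patch sizes for fusion is nearly free compared with re-aggregating features per patch.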
Pages: 14136-14147 (12 pages)