SAM-Net: Self-Attention based Feature Matching with Spatial Transformers and Knowledge Distillation

Cited by: 4
Authors
Kelenyi, Benjamin [1]
Domsa, Victor [1]
Tamas, Levente [1]
Affiliations
[1] Tech Univ Cluj Napoca, Memorandumului 28, Cluj Napoca 400114, Romania
Keywords
Geometric features extraction; Self-attention; Knowledge-distillation; Spatial transformers; Pose estimation;
DOI
10.1016/j.eswa.2023.122804
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We introduce a novel approach that improves 2D feature matching and pose estimation by integrating a hierarchical attention mechanism with knowledge distillation. The proposed hierarchical attention mechanism operates at multiple scales, providing both global context awareness and precise matching of 2D features, which is crucial for many computer vision tasks. To further improve performance, we incorporate insights from an existing model, PixLoc (Sarlin et al., 2021), through knowledge distillation, acquiring its behavior and capabilities while ignoring dynamic objects. SAM-Net outperforms state-of-the-art methods on both indoor and outdoor public datasets. On the indoor dataset, our approach achieves AUC (5°/10°/20°) scores of 55.31/71.70/83.37. On the outdoor dataset, it achieves AUC values of 26.01/46.44/63.61. Furthermore, SAM-Net ranks first among published methods on two public visual localization benchmarks, highlighting the practical benefits of the proposed method. The code and test suite can be accessed at the link provided in the paper.
Pages: 12
Related papers
61 in total
  • [1] Alahi A, 2012, PROC CVPR IEEE, P510, DOI 10.1109/CVPR.2012.6247715
  • [2] SURF: Speeded up robust features
    Bay, Herbert
    Tuytelaars, Tinne
    Van Gool, Luc
    [J]. COMPUTER VISION - ECCV 2006, PT 1, PROCEEDINGS, 2006, 3951 : 404 - 417
  • [3] Augmented reality integration into MES for connected workers
    Blaga, Andreea
    Militaru, Cristian
    Mezei, Ady-Daniel
    Tamas, Levente
    [J]. ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2021, 68
  • [4] HTMatch: An efficient hybrid transformer based graph neural network for local feature matching
    Cai, Youcheng
    Li, Lin
    Wang, Dong
    Li, Xinjie
    Liu, Xiaoping
    [J]. SIGNAL PROCESSING, 2023, 204
  • [5] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
  • [6] ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer
    Chen, Hongkai
    Luo, Zixin
    Zhou, Lei
    Tian, Yurun
    Zhen, Mingmin
    Fang, Tian
    McKinnon, David
    Tsin, Yanghai
    Quan, Long
    [J]. COMPUTER VISION - ECCV 2022, PT XXXII, 2022, 13692 : 20 - 36
  • [7] Learning to Match Features with Seeded Graph Matching Network
    Chen, Hongkai
    Luo, Zixin
    Zhang, Jiahui
    Zhou, Lei
    Bai, Xuyang
    Hu, Zeyu
    Tai, Chiew-Lan
    Quan, Long
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 6281 - 6290
  • [8] Chowdhary KR., 2020, Fundamentals of Artificial Intelligence, P603, DOI 10.1007/978-81-322-3972-7_19
  • [9] ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
    Dai, Angela
    Chang, Angel X.
    Savva, Manolis
    Halber, Maciej
    Funkhouser, Thomas
    Niessner, Matthias
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 2432 - 2443
  • [10] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
    Dai, Zhigang
    Cai, Bolun
    Lin, Yugeng
    Chen, Junying
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1601 - 1610