SAM-Net: Self-Attention based Feature Matching with Spatial Transformers and Knowledge Distillation

Cited by: 4

Authors:

Kelenyi, Benjamin [1]

Domsa, Victor [1]

Tamas, Levente [1]

Affiliations:

[1] Tech Univ Cluj Napoca, Memorandumului 28, Cluj Napoca 400114, Romania

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Vol. 242

Keywords:

Geometric features extraction; Self-attention; Knowledge-distillation; Spatial transformers; Pose estimation;

DOI:

10.1016/j.eswa.2023.122804

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we introduce a novel approach that improves 2D feature matching and pose estimation by integrating a hierarchical attention mechanism with knowledge distillation. The proposed hierarchical attention mechanism operates at multiple scales, enabling both global context awareness and precise matching of 2D features, which is crucial for many computer vision tasks. To further improve the model's performance, we incorporate insights from an existing model, PixLoc (Sarlin et al., 2021), through knowledge distillation, effectively acquiring its behavior and capabilities by ignoring dynamic objects. SAM-Net outperforms state-of-the-art methods on both indoor and outdoor public datasets. On the indoor dataset, our approach achieves AUC (5°/10°/20°) scores of 55.31/71.70/83.37; on the outdoor dataset, it achieves AUC values of 26.01/46.44/63.61. Furthermore, SAM-Net achieves top ranking among published methods in two public visual localization benchmarks, highlighting the practical benefits of the proposed method. The code and test suite can be accessed at link.1

Pages: 12

References (61 in total):

[1] Alahi A, 2012, PROC CVPR IEEE, P510, DOI 10.1109/CVPR.2012.6247715
[2] Bay, Herbert; Tuytelaars, Tinne; Van Gool, Luc. SURF: Speeded up robust features [J]. COMPUTER VISION - ECCV 2006, PT 1, PROCEEDINGS, 2006, 3951: 404-417
[3] Blaga, Andreea; Militaru, Cristian; Mezei, Ady-Daniel; Tamas, Levente. Augmented reality integration into MES for connected workers [J]. ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2021, 68
[4] Cai, Youcheng; Li, Lin; Wang, Dong; Li, Xinjie; Liu, Xiaoping. HTMatch: An efficient hybrid transformer based graph neural network for local feature matching [J]. SIGNAL PROCESSING, 2023, 204
[5] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
[6] Chen, Hongkai; Luo, Zixin; Zhou, Lei; Tian, Yurun; Zhen, Mingmin; Fang, Tian; McKinnon, David; Tsin, Yanghai; Quan, Long. ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer [J]. COMPUTER VISION - ECCV 2022, PT XXXII, 2022, 13692: 20-36
[7] Chen, Hongkai; Luo, Zixin; Zhang, Jiahui; Zhou, Lei; Bai, Xuyang; Hu, Zeyu; Tai, Chiew-Lan; Quan, Long. Learning to Match Features with Seeded Graph Matching Network [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6281-6290
[8] Chowdhary KR., 2020, Fundamentals of Artificial Intelligence, P603, DOI 10.1007/978-81-322-3972-7_19
[9] Dai, Angela; Chang, Angel X.; Savva, Manolis; Halber, Maciej; Funkhouser, Thomas; Niessner, Matthias. ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 2432-2443
[10] Dai, Zhigang; Cai, Bolun; Lin, Yugeng; Chen, Junying. UP-DETR: Unsupervised Pre-training for Object Detection with Transformers [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 1601-1610
