Transformer With Linear-Window Attention for Feature Matching

Times Cited: 0
Authors
Shen, Zhiwei [1 ,2 ]
Kong, Bin [1 ,3 ,4 ]
Dong, Xiaoyu [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Hefei Inst Intelligent Machines, Hefei 230031, Peoples R China
[2] Univ Sci & Technol China, Hefei Inst Phys Sci, Hefei 230026, Peoples R China
[3] Anhui Engn Lab Intelligent Driving Technol & Appli, Hefei 230088, Peoples R China
[4] Chinese Acad Sci, Innovat Res Inst Robot & Intelligent Mfg Hefei, Hefei 230088, Peoples R China
Keywords
Feature extraction; Transformers; Task analysis; Computational modeling; Computational efficiency; Memory management; Visualization; Feature matching; visual transformer; detector-free; computational complexity; low-texture;
DOI
10.1109/ACCESS.2023.3328855
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
A transformer can capture long-range dependencies through its attention mechanism and can therefore be applied to various vision tasks. However, its quadratic computational complexity is a major obstacle in vision tasks that require accurate predictions. To address this limitation, this study introduces linear-window attention (LWA), a new attention model for vision transformers. The transformer computes self-attention that is restricted to non-overlapping local windows and expressed as a linear dot product of kernel feature maps. Furthermore, the computational complexity of each window is reduced from quadratic to linear using the associative property of matrix products. In addition, we applied LWA to feature matching to construct a coarse-to-fine detector-free feature matching method, called transformer with linear-window attention for feature matching (TRLWAM). At the coarse level, we extract dense pixel-level matches; at the fine level, we obtain the final matching results via multi-head multilayer perceptron refinement. We demonstrate the effectiveness of LWA through replacement experiments. The results show that TRLWAM can extract dense matches from low-texture or repetitive-pattern regions in indoor environments, and achieves excellent results at low computational cost on the MegaDepth and HPatches datasets. We believe the proposed LWA can provide new insights for transformer applications in vision tasks.
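The core idea sketched in the abstract — self-attention confined to non-overlapping windows, with the per-window softmax replaced by a kernel feature map so that associativity turns the quadratic product (φ(Q)φ(K)ᵀ)V into the linear one φ(Q)(φ(K)ᵀV) — can be illustrated as follows. This is a minimal single-head NumPy sketch, not the authors' implementation; the elu+1 feature map and the function names are assumptions chosen because they are the standard choice in kernel-based linear attention.

```python
import numpy as np

def elu_feature_map(x):
    # Assumed kernel feature map phi(x) = elu(x) + 1 (positive-valued),
    # a common choice in linear-attention variants.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_window_attention(q, k, v, window):
    """Self-attention restricted to non-overlapping windows of size w.

    Per window, instead of forming the w x w matrix phi(Q) phi(K)^T (quadratic
    in w), associativity lets us compute phi(K)^T V first (d x dv), so the
    cost per window is linear in w.
    """
    n, d = q.shape
    assert n % window == 0, "sequence length must be divisible by window size"
    out = np.empty_like(v)
    for s in range(0, n, window):
        Q = elu_feature_map(q[s:s + window])   # (w, d)
        K = elu_feature_map(k[s:s + window])   # (w, d)
        V = v[s:s + window]                    # (w, dv)
        kv = K.T @ V                           # (d, dv): summarizes the window once
        z = K.sum(axis=0)                      # (d,): normalizer accumulator
        # (Q @ kv) equals (Q K^T) V; (Q @ z) equals the row sums of Q K^T.
        out[s:s + window] = (Q @ kv) / (Q @ z)[:, None]
    return out
```

Because the kernelized attention matrix is never materialized, memory per window also drops from O(w²) to O(w·d), which is what makes dense coarse-level matching over full feature maps affordable.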
Pages: 121202 - 121211
Page count: 10
Related Papers
50 records in total
  • [11] Guided Local Feature Matching with Transformer
    Du, Siliang
    Xiao, Yilin
    Huang, Jingwei
    Sun, Mingwei
    Liu, Mingzhong
    REMOTE SENSING, 2023, 15 (16)
  • [12] LGFCTR: Local and Global Feature Convolutional Transformer for Image Matching
    Zhong, Wenhao
    Jiang, Jie
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 270
  • [13] AMatFormer: Efficient Feature Matching via Anchor Matching Transformer
    Jiang, Bo
    Luo, Shuxian
    Wang, Xiao
    Li, Chuanfu
    Tang, Jin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1504 - 1515
  • [14] Swin-transformer for weak feature matching
    Guo, Yuan
    Li, Wenpeng
    Zhai, Ping
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [15] Exploring Attention Sparsity to Accelerate Transformer Training on GPUs
    Yoon, Bokyeong
    Lee, Ah-Hyun
    Kim, Jinsung
    Moon, Gordon Euhyun
    IEEE ACCESS, 2024, 12 : 131373 - 131384
  • [16] MatchFormer: Interleaving Attention in Transformers for Feature Matching
    Wang, Qing
    Zhang, Jiaming
    Yang, Kailun
    Peng, Kunyu
    Stiefelhagen, Rainer
    COMPUTER VISION - ACCV 2022, PT III, 2023, 13843 : 256 - 273
  • [17] Explainability Enhanced Object Detection Transformer With Feature Disentanglement
    Yu, Wenlong
    Liu, Ruonan
    Chen, Dongyue
    Hu, Qinghua
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 6439 - 6454
  • [18] Remote Sensing Image Change Detection Transformer Network Based on Dual-Feature Mixed Attention
    Song, Xinyang
    Hua, Zhen
    Li, Jinjiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [19] Robust Feature Matching for Remote Sensing Image Registration via Linear Adaptive Filtering
    Jiang, Xingyu
    Ma, Jiayi
    Fan, Aoxiang
    Xu, Haiping
    Lin, Geng
    Lu, Tao
    Tian, Xin
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (02): 1577 - 1591
  • [20] Correspondence Attention Transformer: A Context-Sensitive Network for Two-View Correspondence Learning
    Ma, Jiayi
    Wang, Yang
    Fan, Aoxiang
    Xiao, Guobao
    Chen, Riqing
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 3509 - 3524