Transformer With Linear-Window Attention for Feature Matching

Cited by: 0

Authors:

Shen, Zhiwei [1,2]

Kong, Bin [1,3,4]

Dong, Xiaoyu [1,2]

Affiliations:

[1] Chinese Acad Sci, Hefei Inst Intelligent Machines, Hefei 230031, Peoples R China

[2] Univ Sci & Technol China, Hefei Inst Phys Sci, Hefei 230026, Peoples R China

[3] Anhui Engn Lab Intelligent Driving Technol & Appli, Hefei 230088, Peoples R China

[4] Chinese Acad Sci, Innovat Res Inst Robot & Intelligent Mfg Hefei, Hefei 230088, Peoples R China

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Feature extraction; Transformers; Task analysis; Computational modeling; Computational efficiency; Memory management; Visualization; Feature matching; visual transformer; detector-free; computational complexity; low-texture;

DOI:

10.1109/ACCESS.2023.3328855

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

A transformer can capture long-range dependencies through its attention mechanism and can therefore be applied to various vision tasks. However, its quadratic computational complexity is a major obstacle in vision tasks that require accurate predictions. To address this limitation, this study introduces linear-window attention (LWA), a new attention model for vision transformers. LWA computes self-attention that is restricted to non-overlapping local windows and represented as a linear dot product of kernel feature mappings. The computational complexity of each window is thereby reduced from quadratic to linear using the associative property of matrix products. In addition, we applied LWA to feature matching to construct a coarse-to-fine, detector-free feature matching method called transformer with linear-window attention for feature matching (TRLWAM). At the coarse level, we extracted dense pixel-level matches, and at the fine level, we obtained the final matching results via multi-head multilayer perceptron refinement. We demonstrated the effectiveness of LWA through replacement experiments. The results showed that TRLWAM could extract dense matches from low-texture or repetitive-pattern regions in indoor environments and achieved excellent results at a low computational cost on the MegaDepth and HPatches datasets. We believe the proposed LWA can provide new ideas for transformer applications in visual tasks.
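
The windowed, kernelized attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature map `elu(x) + 1`, the function names, and the tensor layout are all assumptions chosen for illustration, since the abstract does not specify them. The sketch only shows how reordering the matrix product inside each window avoids forming the token-by-token attention matrix.

```python
import numpy as np

def feature_map(x):
    # Assumed kernel feature map phi(x) = elu(x) + 1 (strictly positive).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def window_partition(x, w):
    # Split an (H, W, C) map into non-overlapping w x w windows.
    # Returns an array of shape (num_windows, w * w, C).
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def linear_window_attention(q, k, v, w, eps=1e-6):
    # Linear form: phi(Q) @ (phi(K)^T @ V), computed per window.
    # Cost per window grows linearly with the number of tokens.
    qw = feature_map(window_partition(q, w))
    kw = feature_map(window_partition(k, w))
    vw = window_partition(v, w)
    kv = np.einsum("nlc,nld->ncd", kw, vw)      # (n, C, D) summary per window
    z = kw.sum(axis=1)                          # (n, C) normalizer terms
    num = np.einsum("nlc,ncd->nld", qw, kv)
    den = np.einsum("nlc,nc->nl", qw, z)[..., None] + eps
    return num / den

def quadratic_window_attention(q, k, v, w, eps=1e-6):
    # Reference form: (phi(Q) @ phi(K)^T) @ V, which materializes an
    # L x L attention matrix per window (quadratic in window tokens).
    qw = feature_map(window_partition(q, w))
    kw = feature_map(window_partition(k, w))
    vw = window_partition(v, w)
    attn = np.einsum("nlc,nmc->nlm", qw, kw)
    attn = attn / (attn.sum(axis=-1, keepdims=True) + eps)
    return np.einsum("nlm,nmd->nld", attn, vw)

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 8, 4)) for _ in range(3))
out_linear = linear_window_attention(q, k, v, w=4)
out_quadratic = quadratic_window_attention(q, k, v, w=4)
```

Under these assumptions the two functions agree up to floating-point error, because matrix multiplication is associative; only the order of evaluation, and hence the cost, differs.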

Pages: 121202-121211

Page count: 10
