DBA: Efficient Transformer With Dynamic Bilinear Low-Rank Attention

Citations: 0
Authors
Qin, Bosheng [1 ]
Li, Juncheng [1 ]
Tang, Siliang [1 ]
Zhuang, Yueting [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Complexity theory; Attention mechanisms; Memory management; Training; Kernel; Sparse matrices; Optimization; Learning systems; Image coding; Bilinear optimization; dynamic compression; efficient transformer; low-rank attention;
DOI
10.1109/TNNLS.2025.3527046
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many studies have aimed to improve Transformer model efficiency using low-rank-based methods that compress sequence length with predetermined or learned compression matrices. However, these methods fix compression coefficients for tokens in the same position during inference, ignoring sequence-specific variations. They also overlook the impact of hidden state dimensions on efficiency gains. To address these limitations, we propose dynamic bilinear low-rank attention (DBA), an efficient and effective attention mechanism that compresses sequence length using input-sensitive dynamic compression matrices. DBA achieves linear time and space complexity by jointly optimizing sequence length and hidden state dimension while maintaining state-of-the-art performance. Specifically, we demonstrate through experiments and the properties of low-rank matrices that sequence length can be compressed with compression coefficients dynamically determined by the input sequence. In addition, we illustrate that the hidden state dimension can be approximated by extending the Johnson-Lindenstrauss lemma, thereby introducing only a small amount of error. DBA optimizes the attention mechanism through bilinear forms that consider both the sequence length and hidden state dimension. Moreover, the theoretical analysis substantiates that DBA excels at capturing high-order relationships in cross-attention problems. Experimental results across different tasks with varied sequence length conditions demonstrate that DBA achieves state-of-the-art performance compared to several robust baselines. DBA also maintains higher processing speed and lower memory usage, highlighting its efficiency and effectiveness across diverse applications.
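The abstract describes two compressions applied jointly: an input-dependent (dynamic) low-rank compression along the sequence length, and a Johnson-Lindenstrauss-style reduction of the hidden state dimension inside the attention scores. The PyTorch sketch below illustrates that general idea under stated assumptions; it is not the authors' implementation. The layer names (to_compress, jl_proj), the softmax normalization of the compression coefficients, and the single-head formulation are all hypothetical choices made for illustration.

```python
# Minimal sketch of a DBA-style attention layer (PyTorch).
# Illustrative only: layer names and normalization are assumptions,
# not taken from the paper.
import math
import torch
import torch.nn as nn

class DynamicLowRankAttention(nn.Module):
    def __init__(self, dim: int, rank: int, proj_dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # Input-dependent compression: maps each token to `rank` mixing
        # weights, so the compression matrix varies with the input sequence
        # (unlike a fixed, position-indexed projection).
        self.to_compress = nn.Linear(dim, rank)
        # JL-style projection of the hidden dimension (dim -> proj_dim)
        # used only when computing attention scores, mirroring the paper's
        # appeal to the Johnson-Lindenstrauss lemma.
        self.jl_proj = nn.Linear(dim, proj_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Dynamic compression coefficients: (batch, seq_len, rank),
        # normalized over the sequence axis so each of the `rank`
        # summaries is a convex combination of tokens.
        p = torch.softmax(self.to_compress(x), dim=1)
        k_c = torch.einsum('bnr,bnd->brd', p, k)  # (batch, rank, dim)
        v_c = torch.einsum('bnr,bnd->brd', p, v)  # (batch, rank, dim)
        # Score queries against the compressed keys in the reduced
        # hidden dimension.
        q_p, k_p = self.jl_proj(q), self.jl_proj(k_c)
        scores = q_p @ k_p.transpose(-1, -2) / math.sqrt(q_p.shape[-1])
        attn = torch.softmax(scores, dim=-1)      # (batch, seq_len, rank)
        return attn @ v_c                         # (batch, seq_len, dim)

# Usage: a length-1024 sequence is summarized into rank-64 mixtures.
layer = DynamicLowRankAttention(dim=256, rank=64, proj_dim=64)
out = layer(torch.randn(2, 1024, 256))  # -> (2, 1024, 256)
```

Under this factorization the score matrix is (seq_len x rank) rather than (seq_len x seq_len), so time and memory scale linearly in sequence length, consistent with the complexity claim in the abstract.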
Pages: 15
Related Papers
50 records in total
  • [21] Low-rank matrix factorization with nonconvex regularization and bilinear decomposition
    Wang, Sijie
    Xia, Kewen
    Wang, Li
    Yin, Zhixian
    He, Ziping
    Zhang, Jiangnan
    Aslam, Naila
    SIGNAL PROCESSING, 2022, 201
  • [22] LOW-RANK APPROXIMATIONS FOR DYNAMIC IMAGING
    Haldar, Justin P.
    Liang, Zhi-Pei
    2011 8TH IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING: FROM NANO TO MACRO, 2011, : 1052 - 1055
  • [23] Efficient Dynamic Parallel MRI Reconstruction for the Low-Rank Plus Sparse Model
    Lin, Claire Yilin
    Fessler, Jeffrey A.
    IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING, 2019, 5 (01) : 17 - 26
  • [24] An efficient algorithm for dynamic MRI using low-rank and total variation regularizations
    Yao, Jiawen
    Xu, Zheng
    Huang, Xiaolei
    Huang, Junzhou
    MEDICAL IMAGE ANALYSIS, 2018, 44 : 14 - 27
  • [25] DSFormer-LRTC: Dynamic Spatial Transformer for Traffic Forecasting With Low-Rank Tensor Compression
    Zhao, Jianli
    Zhuo, Futong
    Sun, Qiuxia
    Li, Qing
    Hua, Yiran
    Zhao, Jianye
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (11) : 16323 - 16335
  • [26] Attention-Guided Low-Rank Tensor Completion
    Truong Thanh Nhat Mai
    Lam, Edmund Y.
    Lee, Chul
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) : 9818 - 9833
  • [27] Scatterbrain: Unifying Sparse and Low-rank Attention Approximation
    Chen, Beidi
    Dao, Tri
    Winsor, Eric
    Song, Zhao
    Rudra, Atri
    Re, Christopher
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [28] CROSS: EFFICIENT LOW-RANK TENSOR COMPLETION
    Zhang, Anru
    ANNALS OF STATISTICS, 2019, 47 (02): : 936 - 964
  • [29] EFFICIENT LEARNING OF DICTIONARIES WITH LOW-RANK ATOMS
    Ravishankar, Saiprasad
    Moore, Brian E.
    Nadakuditi, Raj Rao
    Fessler, Jeffrey A.
    2016 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2016, : 222 - 226
  • [30] LOW-RANK MATRIX RECOVERY OF DYNAMIC EVENTS
    Asif, M. Salman
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 1215 - 1219