DBA: Efficient Transformer With Dynamic Bilinear Low-Rank Attention

Citations: 0
Authors
Qin, Bosheng [1 ]
Li, Juncheng [1 ]
Tang, Siliang [1 ]
Zhuang, Yueting [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Complexity theory; Attention mechanisms; Memory management; Training; Kernel; Sparse matrices; Optimization; Learning systems; Image coding; Bilinear optimization; dynamic compression; efficient transformer; low-rank attention;
DOI
10.1109/TNNLS.2025.3527046
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many studies have aimed to improve Transformer model efficiency using low-rank-based methods that compress sequence length with predetermined or learned compression matrices. However, these methods fix compression coefficients for tokens in the same position during inference, ignoring sequence-specific variations. They also overlook the impact of hidden state dimensions on efficiency gains. To address these limitations, we propose dynamic bilinear low-rank attention (DBA), an efficient and effective attention mechanism that compresses sequence length using input-sensitive dynamic compression matrices. DBA achieves linear time and space complexity by jointly optimizing sequence length and hidden state dimension while maintaining state-of-the-art performance. Specifically, we demonstrate through experiments and the properties of low-rank matrices that sequence length can be compressed with compression coefficients dynamically determined by the input sequence. In addition, we illustrate that the hidden state dimension can be approximated by extending the Johnson-Lindenstrauss lemma, thereby introducing only a small amount of error. DBA optimizes the attention mechanism through bilinear forms that consider both the sequence length and hidden state dimension. Moreover, the theoretical analysis substantiates that DBA excels at capturing high-order relationships in cross-attention problems. Experimental results across different tasks with varied sequence length conditions demonstrate that DBA achieves state-of-the-art performance compared to several robust baselines. DBA also maintains higher processing speed and lower memory usage, highlighting its efficiency and effectiveness across diverse applications.
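The abstract describes two compressions applied jointly: an input-dependent (dynamic) low-rank compression along the sequence length, and a Johnson-Lindenstrauss-style reduction of the hidden state dimension inside the attention scores. The PyTorch sketch below illustrates that general idea under stated assumptions; it is not the authors' implementation. The layer names (to_compress, jl_proj), the softmax normalization of the compression coefficients, and the single-head formulation are all hypothetical choices made for illustration.

```python
# Minimal sketch of a DBA-style attention layer (PyTorch).
# Illustrative only: layer names and normalization are assumptions,
# not taken from the paper.
import math
import torch
import torch.nn as nn

class DynamicLowRankAttention(nn.Module):
    def __init__(self, dim: int, rank: int, proj_dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # Input-dependent compression: maps each token to `rank` mixing
        # weights, so the compression matrix varies with the input sequence
        # (unlike a fixed, position-indexed projection).
        self.to_compress = nn.Linear(dim, rank)
        # JL-style projection of the hidden dimension (dim -> proj_dim)
        # used only when computing attention scores, mirroring the paper's
        # appeal to the Johnson-Lindenstrauss lemma.
        self.jl_proj = nn.Linear(dim, proj_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Dynamic compression coefficients: (batch, seq_len, rank),
        # normalized over the sequence axis so each of the `rank`
        # summaries is a convex combination of tokens.
        p = torch.softmax(self.to_compress(x), dim=1)
        k_c = torch.einsum('bnr,bnd->brd', p, k)  # (batch, rank, dim)
        v_c = torch.einsum('bnr,bnd->brd', p, v)  # (batch, rank, dim)
        # Score queries against the compressed keys in the reduced
        # hidden dimension.
        q_p, k_p = self.jl_proj(q), self.jl_proj(k_c)
        scores = q_p @ k_p.transpose(-1, -2) / math.sqrt(q_p.shape[-1])
        attn = torch.softmax(scores, dim=-1)      # (batch, seq_len, rank)
        return attn @ v_c                         # (batch, seq_len, dim)

# Usage: a length-1024 sequence is summarized into rank-64 mixtures.
layer = DynamicLowRankAttention(dim=256, rank=64, proj_dim=64)
out = layer(torch.randn(2, 1024, 256))  # -> (2, 1024, 256)
```

Under this factorization the score matrix is (seq_len x rank) rather than (seq_len x seq_len), so time and memory scale linearly in sequence length, consistent with the complexity claim in the abstract.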
Pages: 15
Related Papers
50 records in total
  • [21] Low-rank matrix factorization with nonconvex regularization and bilinear decomposition
    Wang, Sijie
    Xia, Kewen
    Wang, Li
    Yin, Zhixian
    He, Ziping
    Zhang, Jiangnan
    Aslam, Naila
    SIGNAL PROCESSING, 2022, 201
  • [22] LOW-RANK APPROXIMATIONS FOR DYNAMIC IMAGING
    Haldar, Justin P.
    Liang, Zhi-Pei
    2011 8TH IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING: FROM NANO TO MACRO, 2011, : 1052 - 1055
  • [23] Efficient Dynamic Parallel MRI Reconstruction for the Low-Rank Plus Sparse Model
    Lin, Claire Yilin
    Fessler, Jeffrey A.
    IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING, 2019, 5 (01) : 17 - 26
  • [24] An efficient algorithm for dynamic MRI using low-rank and total variation regularizations
    Yao, Jiawen
    Xu, Zheng
    Huang, Xiaolei
    Huang, Junzhou
    MEDICAL IMAGE ANALYSIS, 2018, 44 : 14 - 27
  • [25] DSFormer-LRTC: Dynamic Spatial Transformer for Traffic Forecasting With Low-Rank Tensor Compression
    Zhao, Jianli
    Zhuo, Futong
    Sun, Qiuxia
    Li, Qing
    Hua, Yiran
    Zhao, Jianye
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (11) : 16323 - 16335
  • [26] Attention-Guided Low-Rank Tensor Completion
    Truong Thanh Nhat Mai
    Lam, Edmund Y.
    Lee, Chul
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) : 9818 - 9833
  • [27] Scatterbrain: Unifying Sparse and Low-rank Attention Approximation
    Chen, Beidi
    Dao, Tri
    Winsor, Eric
    Song, Zhao
    Rudra, Atri
    Re, Christopher
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [28] CROSS: EFFICIENT LOW-RANK TENSOR COMPLETION
    Zhang, Anru
    ANNALS OF STATISTICS, 2019, 47 (02): : 936 - 964
  • [29] EFFICIENT LEARNING OF DICTIONARIES WITH LOW-RANK ATOMS
    Ravishankar, Saiprasad
    Moore, Brian E.
    Nadakuditi, Raj Rao
    Fessler, Jeffrey A.
    2016 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2016, : 222 - 226
  • [30] LOW-RANK MATRIX RECOVERY OF DYNAMIC EVENTS
    Asif, M. Salman
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 1215 - 1219