DBA: Efficient Transformer With Dynamic Bilinear Low-Rank Attention

Cited: 0
Authors
Qin, Bosheng [1 ]
Li, Juncheng [1 ]
Tang, Siliang [1 ]
Zhuang, Yueting [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Complexity theory; Attention mechanisms; Memory management; Training; Kernel; Sparse matrices; Optimization; Learning systems; Image coding; Bilinear optimization; dynamic compression; efficient transformer; low-rank attention;
DOI
10.1109/TNNLS.2025.3527046
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many studies have aimed to improve Transformer efficiency with low-rank methods that compress the sequence length using predetermined or learned compression matrices. However, these methods fix the compression coefficients for tokens in the same position during inference, ignoring sequence-specific variations. They also overlook the effect of the hidden-state dimension on efficiency gains. To address these limitations, we propose dynamic bilinear low-rank attention (DBA), an efficient and effective attention mechanism that compresses the sequence length using input-sensitive dynamic compression matrices. DBA achieves linear time and space complexity by jointly optimizing the sequence length and the hidden-state dimension while maintaining state-of-the-art performance. Specifically, we demonstrate through experiments and the properties of low-rank matrices that the sequence length can be compressed with compression coefficients determined dynamically by the input sequence. In addition, we show that the hidden-state dimension can be approximated by extending the Johnson-Lindenstrauss lemma, introducing only a small amount of error. DBA optimizes the attention mechanism through bilinear forms that consider both the sequence length and the hidden-state dimension. Moreover, theoretical analysis shows that DBA excels at capturing high-order relationships in cross-attention problems. Experimental results on different tasks under varied sequence-length conditions demonstrate that DBA achieves state-of-the-art performance compared with several strong baselines. DBA also maintains higher processing speed and lower memory usage, highlighting its efficiency and effectiveness across diverse applications.
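The abstract describes DBA only at a high level. The minimal PyTorch sketch below illustrates the general idea it outlines: an input-dependent ("dynamic") low-rank compression of the sequence length combined with a Johnson-Lindenstrauss-style random projection of the hidden dimension. It is not the authors' implementation; the module name DynamicLowRankAttention, the hyperparameters k and r, and the exact form of the compression coefficients are assumptions made for illustration only.

# A minimal, illustrative sketch of input-dependent ("dynamic") low-rank attention.
# NOT the authors' implementation: module name, hyperparameters (k, r), and the
# exact compression scheme are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicLowRankAttention(nn.Module):
    """Compresses sequence length n -> k with input-dependent coefficients and
    reduces the hidden dimension d -> r with a fixed JL-style random projection."""

    def __init__(self, d_model: int, k: int = 64, r: int = 32):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Produces k compression coefficients per token (dynamic, input-sensitive).
        self.compress = nn.Linear(d_model, k)
        # Fixed random projection approximating the hidden dimension (JL lemma).
        self.register_buffer("jl", torch.randn(d_model, r) / r ** 0.5)
        self.scale = r ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model)
        q, k_, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Dynamic compression matrix P: (batch, k, n); each row mixes the n tokens.
        p = F.softmax(self.compress(x).transpose(1, 2), dim=-1)

        # Compress keys and values along the sequence axis: (batch, k, d_model).
        k_c = torch.bmm(p, k_)
        v_c = torch.bmm(p, v)

        # Reduce the hidden dimension of the attention logits via the JL map.
        q_r = q @ self.jl        # (batch, n, r)
        k_r = k_c @ self.jl      # (batch, k, r)

        # Attention over k compressed slots: O(n*k) instead of O(n^2).
        attn = F.softmax(torch.bmm(q_r, k_r.transpose(1, 2)) * self.scale, dim=-1)
        return torch.bmm(attn, v_c)  # (batch, n, d_model)


# Usage: the cost is linear in sequence length for fixed k and r.
x = torch.randn(2, 1024, 256)
out = DynamicLowRankAttention(d_model=256)(x)
print(out.shape)  # torch.Size([2, 1024, 256])

For fixed k and r, the attention cost scales linearly in the sequence length n rather than quadratically, which is the complexity behavior the abstract refers to; the "dynamic" aspect is that the compression matrix P is recomputed from each input sequence rather than being a fixed learned matrix.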
Pages: 15