DBA: Efficient Transformer With Dynamic Bilinear Low-Rank Attention

Cited by: 0
Authors
Qin, Bosheng [1 ]
Li, Juncheng [1 ]
Tang, Siliang [1 ]
Zhuang, Yueting [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Complexity theory; Attention mechanisms; Memory management; Training; Kernel; Sparse matrices; Optimization; Learning systems; Image coding; Bilinear optimization; dynamic compression; efficient transformer; low-rank attention;
DOI
10.1109/TNNLS.2025.3527046
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many studies have aimed to improve Transformer efficiency with low-rank methods that compress the sequence length using predetermined or learned compression matrices. However, these methods apply the same compression coefficients to tokens at a given position across all sequences at inference time, ignoring sequence-specific variations. They also overlook the effect of the hidden state dimension on efficiency gains. To address these limitations, we propose dynamic bilinear low-rank attention (DBA), an efficient and effective attention mechanism that compresses the sequence length with input-sensitive dynamic compression matrices. DBA achieves linear time and space complexity by jointly optimizing the sequence length and the hidden state dimension while maintaining state-of-the-art performance. Specifically, we demonstrate through experiments and the properties of low-rank matrices that the sequence length can be compressed with compression coefficients determined dynamically by the input sequence. In addition, we show that the hidden state dimension can be approximated by extending the Johnson-Lindenstrauss lemma, introducing only a small amount of error. DBA optimizes the attention mechanism through bilinear forms that account for both the sequence length and the hidden state dimension. Moreover, theoretical analysis substantiates that DBA excels at capturing high-order relationships in cross-attention problems. Experimental results across tasks with varied sequence lengths demonstrate that DBA achieves state-of-the-art performance compared with several strong baselines, while maintaining higher processing speed and lower memory usage, highlighting its efficiency and effectiveness across diverse applications.
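
The abstract describes the mechanism only in prose, so the NumPy sketch below illustrates one plausible reading of it, assuming a Linformer-style length compression whose coefficients are computed from the input sequence itself, combined with a Johnson-Lindenstrauss-style random projection of the hidden dimension. All names here (dba_attention_sketch, make_dynamic_compressor, wck, wcv, jl, k_len, r) are hypothetical; this is not the authors' released implementation, and the paper's exact bilinear parameterization may differ.

```python
# Minimal sketch, assuming a Linformer-style design with input-dependent
# (dynamic) length compression plus a JL-style hidden-dimension projection.
# Hypothetical names and parameterization; not the paper's code.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def make_dynamic_compressor(x, w):
    # x: (n, d) tokens; w: (d, k_len) learned weights.
    # Returns (k_len, n) compression coefficients that depend on x itself,
    # unlike a fixed projection shared by every sequence at the same positions.
    return softmax(x @ w, axis=0).T

def dba_attention_sketch(x, wq, wk, wv, wck, wcv, jl):
    q, k, v = x @ wq, x @ wk, x @ wv               # (n, d) each
    ck = make_dynamic_compressor(x, wck)           # (k_len, n)
    cv = make_dynamic_compressor(x, wcv)           # (k_len, n)
    k_c, v_c = ck @ k, cv @ v                      # (k_len, d): length compressed
    q_r, k_r = q @ jl, k_c @ jl                    # JL projection: d -> r
    scores = softmax(q_r @ k_r.T / np.sqrt(jl.shape[1]), axis=-1)  # (n, k_len)
    return scores @ v_c                            # (n, d)

n, d, k_len, r = 128, 64, 16, 32
rng = np.random.default_rng(0)
x = rng.standard_normal((n, d))
wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
wck, wcv = rng.standard_normal((d, k_len)), rng.standard_normal((d, k_len))
jl = rng.standard_normal((d, r)) / np.sqrt(r)      # random JL-style projection
print(dba_attention_sketch(x, wq, wk, wv, wck, wcv, jl).shape)  # (128, 64)
```

Under these assumptions the per-layer cost scales roughly as O(n * k_len * r) rather than O(n^2 * d), i.e., linearly in the sequence length n, which is consistent with the linear time and space complexity stated in the abstract.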
Pages: 15