Dynamic Low-rank Estimation for Transformer-based Language Models

Cited by: 0
Authors
Huai, Ting [1 ]
Lie, Xiao [2 ]
Gao, Shangqian [1 ]
Hsu, Yenchang [2 ]
Shen, Yilin [2 ]
Jin, Hongxia [1 ]
Affiliations
[1] Samsung Res Amer, Mountain View, CA 94043 USA
[2] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Matrix decomposition methods, such as Singular Value Decomposition (SVD) and its importance-weighted variants, have been widely used for compressing Transformer-based language models. While importance-weighted decomposition methods relax SVD's strong assumption that every parameter is equally important, they still rely on two fundamental assumptions: 1) the importance distribution remains unchanged during further fine-tuning, and 2) importance is equal across weight matrices in different layers. Furthermore, these methods require a well-trained task-specific model as the starting point and additional fine-tuning after compression. In this work, we propose RankDyna, a matrix decomposition method that dynamically allocates rank resources among matrices across different layers during training. Starting from a general pre-trained model, RankDyna achieves the dual goals of compression and adaptation to the downstream task within a single round of fine-tuning. Extensive evaluations demonstrate that RankDyna outperforms current SOTA methods under various parameter budget levels, and its advantage grows at higher compression rates.
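The abstract describes SVD-based compression with a rank budget distributed across layers. The following is a minimal, self-contained sketch (NumPy, not the authors' code) of the underlying operations: truncated-SVD factorization of a weight matrix, plus a naive energy-proportional rank allocation across layers. The function names, the 256-rank budget, and the allocation heuristic are illustrative assumptions; RankDyna itself allocates ranks dynamically during fine-tuning rather than with a one-shot static criterion.

```python
import numpy as np

def truncated_svd(weight: np.ndarray, rank: int):
    """Factor W (d_out x d_in) into two low-rank factors A @ B.

    Keeping the top-`rank` singular triplets gives the best rank-`rank`
    approximation of W in the Frobenius norm (Eckart-Young theorem).
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # d_out x rank (singular values folded into A)
    b = vt[:rank, :]             # rank  x d_in
    return a, b

def allocate_ranks(weights, total_rank_budget: int):
    """Toy global rank allocation (illustrative assumption, not RankDyna):
    give each matrix a share of the budget proportional to its singular-value
    energy instead of a uniform per-layer rank. Rounding means the shares only
    approximately sum to the budget.
    """
    energies = [np.linalg.svd(w, compute_uv=False).sum() for w in weights]
    total = sum(energies)
    return [max(1, int(round(total_rank_budget * e / total))) for e in energies]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [rng.standard_normal((768, 768)) for _ in range(4)]  # stand-in weights
    ranks = allocate_ranks(layers, total_rank_budget=256)
    for w, r in zip(layers, ranks):
        a, b = truncated_svd(w, r)
        err = np.linalg.norm(w - a @ b) / np.linalg.norm(w)
        print(f"rank={r:3d}  params kept={a.size + b.size:7d}  rel. error={err:.3f}")
```

The sketch only illustrates why a non-uniform rank allocation can matter: matrices whose spectra decay slowly receive more of the budget, while well-approximated matrices receive less.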
Pages: 9275-9287
Page count: 13
Related Papers
50 items in total
  • [1] Leveraging Transformer-based autoencoders for low-rank multi-view subspace clustering. Lin, Yuxiu; Liu, Hui; Yu, Xiao; Zhang, Caiming. PATTERN RECOGNITION, 2025, 161.
  • [2] Arlo: Serving Transformer-based Language Models with Dynamic Input Lengths. Tan, Xin; Li, Jiamin; Yang, Yitao; Li, Jingzong; Xu, Hong. 53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024, 2024: 367-376.
  • [3] DBA: Efficient Transformer With Dynamic Bilinear Low-Rank Attention. Qin, Bosheng; Li, Juncheng; Tang, Siliang; Zhuang, Yueting. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025.
  • [4] Ouroboros: On Accelerating Training of Transformer-Based Language Models. Yang, Qian; Huo, Zhouyuan; Wang, Wenlin; Huang, Heng; Carin, Lawrence. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32.
  • [5] Transformer-Based Language Models for Software Vulnerability Detection. Thapa, Chandra; Jang, Seung Ick; Ahmed, Muhammad Ejaz; Camtepe, Seyit; Pieprzyk, Josef; Nepal, Surya. PROCEEDINGS OF THE 38TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE, ACSAC 2022, 2022: 481-496.
  • [6] A Comparison of Transformer-Based Language Models on NLP Benchmarks. Greco, Candida Maria; Tagarelli, Andrea; Zumpano, Ester. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2022), 2022, 13286: 490-501.
  • [7] RadBERT: Adapting Transformer-based Language Models to Radiology. Yan, An; McAuley, Julian; Lu, Xing; Du, Jiang; Chang, Eric Y.; Gentili, Amilcare; Hsu, Chun-Nan. RADIOLOGY-ARTIFICIAL INTELLIGENCE, 2022, 4 (04).
  • [8] Applications of transformer-based language models in bioinformatics: a survey. Zhang, Shuang; Fan, Rui; Liu, Yuti; Chen, Shuang; Liu, Qiao; Zeng, Wanwen. NEURO-ONCOLOGY ADVANCES, 2023, 5 (01).
  • [9] TAG: Gradient Attack on Transformer-based Language Models. Deng, Jieren; Wang, Yijue; Li, Ji; Wang, Chenghong; Shang, Chao; Liu, Hang; Rajasekaran, Sanguthevar; Ding, Caiwen. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 3600-3610.
  • [10] Ayaka: A Versatile Transformer Accelerator With Low-Rank Estimation and Heterogeneous Dataflow. Qin, Yubin; Wang, Yang; Deng, Dazheng; Yang, Xiaolong; Zhao, Zhiren; Zhou, Yang; Fan, Yuanqi; Wei, Jingchuan; Chen, Tianbao; Liu, Leibo; Wei, Shaojun; Hu, Yang; Yin, Shouyi. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2024, 59 (10): 3342-3356.