PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

Cited: 0
Authors
Guo, Yangyang [1]
Wang, Guangzhi [1]
Kankanhalli, Mohan [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
Funding
National Research Foundation, Singapore
DOI
10.1109/CVPR52733.2024.01486
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Applying a pre-trained large model to downstream tasks is prohibitive under resource-constrained conditions. Recent dominant approaches for addressing efficiency issues involve adding a few learnable parameters to the fixed backbone model. This strategy, however, still requires loading the full backbone, which remains challenging for downstream fine-tuning with limited resources. In this paper, we propose a novel method for increasing the parameter efficiency of pre-trained models by introducing an intermediate pre-training stage. To this end, we first employ low-rank approximation to compress the original large model and then devise a feature distillation module and a weight perturbation regularization module, both designed specifically to strengthen the low-rank model. In particular, we update only the low-rank model while freezing the backbone parameters during pre-training, which allows the low-rank model to be used directly and efficiently for downstream fine-tuning. The proposed method achieves efficiency in both parameter count and computation time while maintaining comparable results, with minimal modifications to the backbone architecture. Specifically, when applied to three vision-only and one vision-language Transformer models, our approach often shows a performance decrease of only ~0.6 points while reducing the original parameter size by 1/3 to 2/3. We release our code at link.
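The method sketched in the abstract has two computational pieces: replacing each dense weight matrix with a rank-r factorization obtained by truncated SVD, and distilling the frozen backbone's intermediate features into the compressed model. The PyTorch sketch below illustrates both pieces; it is not the authors' released code. The helper names low_rank_factorize and feature_distillation_loss are hypothetical, and the paper's weight perturbation regularization module is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def low_rank_factorize(linear: nn.Linear, rank: int) -> nn.Sequential:
    """Approximate a dense linear layer by two thin ones via truncated SVD.

    Parameter count drops from (in_features * out_features)
    to rank * (in_features + out_features).
    """
    W = linear.weight.data  # shape: (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    sqrt_s = S[:rank].sqrt()
    down = nn.Linear(linear.in_features, rank, bias=False)
    up = nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    # Split singular values evenly: W ~= (U_r * sqrt(S_r)) @ (sqrt(S_r) * Vh_r)
    down.weight.data = sqrt_s[:, None] * Vh[:rank, :]  # (rank, in_features)
    up.weight.data = U[:, :rank] * sqrt_s[None, :]     # (out_features, rank)
    if linear.bias is not None:
        up.bias.data = linear.bias.data.clone()
    return nn.Sequential(down, up)

def feature_distillation_loss(student_feats, teacher_feats):
    """MSE between the low-rank model's intermediate features and the
    frozen backbone's; only the low-rank model receives gradients."""
    return sum(F.mse_loss(s, t.detach())
               for s, t in zip(student_feats, teacher_feats))
```

As a rough sanity check on the reported compression: factorizing a square 768 x 768 layer at rank 128 leaves 128 x (768 + 768) ≈ 0.2M parameters versus ≈ 0.59M for the dense layer, consistent in scale with the 1/3 to 2/3 parameter reduction the abstract reports at the model level.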
Pages: 15699-15709
Page count: 11
Related Papers
50 in total
  • [21] LOW-RANK PHYSICAL MODEL RECOVERY FROM LOW-RANK SIGNAL APPROXIMATION
    Hayes, Charles Ethan
    McClellan, James H.
    Scott, Waymond R., Jr.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 3131 - 3135
  • [23] Multi-Learning Generalised Low-Rank Models
    Buet-Golfouse, Francois
    Pahwa, Parth
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 860 - 865
  • [24] Learning Markov Models Via Low-Rank Optimization
    Zhu, Ziwei
    Li, Xudong
    Wang, Mengdi
    Zhang, Anru
    OPERATIONS RESEARCH, 2022, 70 (04) : 2384 - 2398
  • [25] A low-rank spectral method for learning Markov models
    Bi, Shujun
    Yin, Zhen
    Weng, Yihong
    OPTIMIZATION LETTERS, 2023, 17 (01) : 143 - 162
  • [26] Leveraging Low-Rank Adaptation for Parameter-Efficient Fine-Tuning in Multi-Speaker Adaptive Text-to-Speech Synthesis
    Hong, Changi
    Lee, Jung Hyuk
    Kim, Hong Kook
    IEEE ACCESS, 2024, 12 : 190711 - 190727
  • [27] An Efficient, Sparsity-Preserving, Online Algorithm for Low-Rank Approximation
    Anderson, David
    Gu, Ming
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017
  • [28] Efficient Low-Rank Approximation of Matrices Based on Randomized Pivoted Decomposition
    Kaloorazi, Maboud F.
    Chen, Jie
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2020, 68 : 3575 - 3589
  • [29] Efficient low-rank approximation of the stochastic Galerkin matrix in tensor formats
    Espig, Mike
    Hackbusch, Wolfgang
    Litvinenko, Alexander
    Matthies, Hermann G.
    Waehnert, Philipp
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2014, 67 (04) : 818 - 829
  • [30] Efficient quaternion CUR method for low-rank approximation to quaternion matrix
    Wu, Pengling
    Kou, Kit Ian
    Cai, Hongmin
    Yu, Zhaoyuan
NUMERICAL ALGORITHMS, 2024