LightFormer: Light-weight Transformer Using SVD-based Weight Transfer and Parameter Sharing

Times cited: 0
Authors
Lu, Xiuqing [1]
Zhang, Peng [1]
Li, Sunzhu [1]
Gan, Guobing [1]
Sun, Yueheng [1]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023) | 2023
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Transformer has become an important and highly successful technique for natural language processing tasks. However, it usually requires large storage space and high computational cost, making it difficult to deploy on resource-constrained edge devices. To compress and accelerate the Transformer, we propose LightFormer, which adopts a low-rank factorization initialized by SVD-based weight transfer and parameter sharing. The SVD-based weight transfer effectively reuses the parameter knowledge of a well-trained Transformer to speed up model convergence and, combined with parameter sharing, alleviates the low-rank bottleneck problem. We validate this method on machine translation, text summarization, and text classification tasks. Experiments show that on IWSLT'14 De-En and WMT'14 En-De, LightFormer achieves performance similar to the baseline Transformer with 3.8x and 1.8x fewer parameters and with 2.3x and 1.5x speedups, respectively, generally outperforming recent light-weight Transformers.
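The abstract only names the technique, so the following is a minimal sketch of what initializing a low-rank factorization from a trained weight matrix via truncated SVD could look like. It is not the authors' implementation: the function name `svd_factor_init`, the choice to split the singular values evenly between the two factors, the matrix size, and the rank are all assumptions made for illustration, and the parameter-sharing part of the method is not shown because the abstract does not describe it.

```python
import numpy as np


def svd_factor_init(weight, rank):
    """Split a dense matrix into two rank-`rank` factors via truncated SVD.

    Hypothetical helper for illustration; returns (left, right) such that
    left @ right is the best rank-`rank` approximation of `weight`
    in the Frobenius norm.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    left = u[:, :rank] * sqrt_s            # shape (d_out, rank)
    right = sqrt_s[:, None] * vt[:rank]    # shape (rank, d_in)
    return left, right


# Stand-in for a trained weight matrix (assumption: random values, 512 x 512).
rng = np.random.default_rng(0)
pretrained = rng.standard_normal((512, 512))

left, right = svd_factor_init(pretrained, rank=64)

dense_params = pretrained.size
factored_params = left.size + right.size
rel_error = np.linalg.norm(pretrained - left @ right) / np.linalg.norm(pretrained)

print("dense parameters:   ", dense_params)
print("factored parameters:", factored_params)
print("compression ratio:  ", dense_params / factored_params)
print("relative error:     ", rel_error)
```

With these assumed sizes the two factors hold 65,536 parameters against 262,144 for the dense matrix, a 4x reduction. A random Gaussian matrix has no low-rank structure, so the reconstruction error printed here is large; it says nothing about the error on real trained weights or about the results reported in the paper.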
Pages: 10323-10335
Number of pages: 13