LightFormer: Light-weight Transformer Using SVD-based Weight Transfer and Parameter Sharing

Cited by: 0
Authors
Lu, Xiuqing [1 ]
Zhang, Peng [1 ]
Li, Sunzhu [1 ]
Gan, Guobing [1 ]
Sun, Yueheng [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023
Keywords: none listed
DOI: not available
Abstract
Transformer has become an important technique for natural language processing tasks with great success. However, it usually requires huge storage space and computational cost, making it difficult to deploy on resource-constrained edge devices. To compress and accelerate Transformer, we propose LightFormer, which adopts a low-rank factorization initialized by SVD-based weight transfer and parameter sharing. The SVD-based weight transfer can effectively utilize the parameter knowledge of a well-trained Transformer to speed up model convergence, and, combined with parameter sharing, effectively alleviates the low-rank bottleneck problem. We validate this method on machine translation, text summarization, and text classification tasks. Experiments show that on IWSLT'14 De-En and WMT'14 En-De, LightFormer achieves performance similar to the baseline Transformer with 3.8x and 1.8x fewer parameters, and achieves 2.3x and 1.5x speedups respectively, generally outperforming recent light-weight Transformers.
Pages: 10323-10335 (13 pages)