Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

Citations: 66
Authors
Ganesh, Prakhar [1 ]
Chen, Yao [1 ]
Lou, Xin [1 ]
Khan, Mohammad Ali [1 ]
Yang, Yin [2 ]
Sajjad, Hassan [3 ]
Nakov, Preslav [3 ]
Chen, Deming [4 ]
Winslett, Marianne [4 ]
Affiliations
[1] Adv Digital Sci Ctr, Singapore, Singapore
[2] Hamad Bin Khalifa Univ, Coll Sci & Engn, Ar Rayyan, Qatar
[3] Hamad Bin Khalifa Univ, Qatar Comp Res Inst, Ar Rayyan, Qatar
[4] Univ Illinois, Urbana, IL USA
Funding
National Research Foundation of Singapore;
Keywords
All Open Access; Gold; Green;
DOI
10.1162/tacl_a_00413
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and thus are too resource-hungry and computation-intensive to suit low-capability devices or applications with strict latency requirements. One potential remedy for this is model compression, which has attracted considerable research attention. Here, we summarize the research in compressing Transformers, focusing on the especially popular BERT model. In particular, we survey the state of the art in compression for BERT, we clarify the current best practices for compressing large-scale Transformer models, and we provide insights into the workings of various methods. Our categorization and analysis also shed light on promising future research directions for achieving lightweight, accurate, and generic NLP models.
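As a minimal illustration of one compression family covered by surveys of this kind, the sketch below shows unstructured magnitude pruning: zeroing the fraction of weights with the smallest absolute values. The toy weight matrix and the 50% sparsity target are made up for illustration; this is not code from the paper.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |w|.

    `weights` is a list of rows (a toy stand-in for a layer's weight matrix).
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)           # number of weights to remove
    threshold = flat[k - 1] if k > 0 else float("-inf")
    return [[0.0 if abs(w) <= threshold else w for w in row]
            for row in weights]

# Hypothetical 3x3 weight matrix for demonstration only.
W = [[0.9, -0.02, 0.4],
     [-0.05, 0.7, 0.01],
     [0.3, -0.8, 0.06]]

pruned = magnitude_prune(W, sparsity=0.5)
zeros = sum(w == 0.0 for row in pruned for w in row)
print(zeros)  # 4 of the 9 weights are zeroed
```

In practice, pruning of Transformer models is applied per layer (or globally across layers) to the learned weight tensors, often followed by fine-tuning to recover accuracy; the survey's other families (quantization, knowledge distillation, architectural changes) follow the same spirit of trading parameters or precision for efficiency.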
Pages: 1061-1080
Page count: 20
Related Papers
50 records in total
  • [1] An Architecture for Accelerated Large-Scale Inference of Transformer-Based Language Models
    Ganiev, Amir
    Chapin, Colt
    de Andrade, Anderson
    Liu, Chen
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2021, 2021, : 163 - 169
  • [2] TRANSFORMER IN ACTION: A COMPARATIVE STUDY OF TRANSFORMER-BASED ACOUSTIC MODELS FOR LARGE SCALE SPEECH RECOGNITION APPLICATIONS
    Wang, Yongqiang
    Shi, Yangyang
    Zhang, Frank
    Wu, Chunyang
    Chan, Julian
    Yeh, Ching-Feng
    Xiao, Alex
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6778 - 6782
  • [3] Transformer-based biomarker prediction from colorectal cancer histology: A large-scale multicentric study
    Wagner, Sophia J.
    Reisenbuechler, Daniel
    West, Nicholas P.
    Niehues, Jan Moritz
    Zhu, Jiefu
    Foersch, Sebastian
    Veldhuizen, Gregory Patrick
    Quirke, Philip
    Grabsch, Heike I.
    van den Brandt, Piet A.
    Hutchins, Gordon G. A.
    Richman, Susan D.
    Yuan, Tanwei
    Langer, Rupert
    Jenniskens, Josien C. A.
    Offermans, Kelly
    Mueller, Wolfram
    Gray, Richard
    Gruber, Stephen B.
    Greenson, Joel K.
    Rennert, Gad
    Bonner, Joseph D.
    Schmolze, Daniel
    Jonnagaddala, Jitendra
    Hawkins, Nicholas J.
    Ward, Robyn L.
    Morton, Dion
    Seymour, Matthew
    Magill, Laura
    Nowak, Marta
    Hay, Jennifer
    Koelzer, Viktor H.
    Church, David N.
    Matek, Christian
    Geppert, Carol
    Peng, Chaolong
    Zhi, Cheng
    Ouyang, Xiaoming
    James, Jacqueline A.
    Loughrey, Maurice B.
    Salto-Tellez, Manuel
    Brenner, Hermann
    Hoffmeister, Michael
    Truhn, Daniel
    Schnabel, Julia A.
    Boxberg, Melanie
    Peng, Tingying
    Kather, Jakob Nikolas
    CANCER CELL, 2023, 41 (09) : 1650 - +
  • [4] AccTFM: An Effective Intra-Layer Model Parallelization Strategy for Training Large-Scale Transformer-Based Models
    Zeng, Zihao
    Liu, Chubo
    Tang, Zhuo
    Li, Kenli
    Li, Keqin
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 4326 - 4338
  • [5] Exploring Effective Approaches on Transformer-Based Neural Models for Multi-clinical Large-Scale Cardiotocogram Data
    Hemmi, Kazunari
    Shibata, Chihiro
    Miyata, Kohei
    Alkanan, Mohannad
    Miyamoto, Shingo
    Imamura, Toshiro
    Fukunishi, Hiroaki
    Numano, Hirotane
    ADVANCES IN DIGITAL HEALTH AND MEDICAL BIOENGINEERING, VOL 1, EHB-2023, 2024, 109 : 439 - 447
  • [6] Compressing Transformer-Based Semantic Parsing Models using Compositional Code Embeddings
    Prakash, Prafull
    Shashidhar, Saurabh Kumar
    Zhao, Wenlong
    Rongali, Subendhu
    Khan, Haidar
    Kayser, Michael
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4711 - 4717
  • [7] A Large-scale Non-standard English Database and Transformer-based Translation System
    Kundu, Arghya
    Uyen Trang Nguyen
    2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023, 2024, : 2472 - 2479
  • [8] Transformer-based Language Models and Homomorphic Encryption: An Intersection with BERT-tiny
    Rovida, Lorenzo
    Leporati, Alberto
    PROCEEDINGS OF THE 10TH ACM INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS, IWSPA 2024, 2024, : 3 - 13
  • [9] Cascaded transformer-based networks for wikipedia large-scale image-caption matching
    Messina, Nicola
    Coccomini, Davide Alessandro
    Esuli, Andrea
    Falchi, Fabrizio
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (23) : 62915 - 62935
  • [10] UAV Cross-Modal Image Registration: Large-Scale Dataset and Transformer-Based Approach
    Xiao, Yun
    Liu, Fei
    Zhu, Yabin
    Li, Chenglong
    Wang, Futian
    Tang, Jin
    ADVANCES IN BRAIN INSPIRED COGNITIVE SYSTEMS, BICS 2023, 2024, 14374 : 166 - 176