Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

Cited by: 66
Authors
Ganesh, Prakhar [1 ]
Chen, Yao [1 ]
Lou, Xin [1 ]
Khan, Mohammad Ali [1 ]
Yang, Yin [2 ]
Sajjad, Hassan [3 ]
Nakov, Preslav [3 ]
Chen, Deming [4 ]
Winslett, Marianne [4 ]
Affiliations
[1] Adv Digital Sci Ctr, Singapore, Singapore
[2] Hamad Bin Khalifa Univ, Coll Sci & Engn, Ar Rayyan, Qatar
[3] Hamad Bin Khalifa Univ, Qatar Comp Res Inst, Ar Rayyan, Qatar
[4] Univ Illinois, Urbana, IL USA
Funding
National Research Foundation, Singapore;
Keywords
All Open Access; Gold; Green;
DOI
10.1162/tacl_a_00413
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and thus are too resource-hungry and computation-intensive to suit low-capability devices or applications with strict latency requirements. One potential remedy for this is model compression, which has attracted considerable research attention. Here, we summarize the research in compressing Transformers, focusing on the especially popular BERT model. In particular, we survey the state of the art in compression for BERT, we clarify the current best practices for compressing large-scale Transformer models, and we provide insights into the workings of various methods. Our categorization and analysis also shed light on promising future research directions for achieving lightweight, accurate, and generic NLP models.
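To make the compression idea in the abstract concrete, below is a minimal, hypothetical sketch of one compression family such surveys typically cover: symmetric int8 post-training quantization of a weight tensor. The function names and the toy weight values are illustrative assumptions, not taken from the paper; real pipelines quantize per-channel and calibrate activations as well.

```python
# Sketch of symmetric per-tensor int8 quantization: floats are mapped to
# 8-bit integer codes plus one float scale, shrinking storage roughly 4x
# versus float32 at the cost of a bounded rounding error.

def quantize_int8(weights):
    """Map float weights to int8 codes with a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-128, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]

weights = [0.42, -1.27, 0.005, 0.89]          # toy weight tensor
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Each restored weight deviates from the original by at most one
# quantization step (the scale), which is the usual accuracy trade-off.
```

In practice the int8 codes are what gets stored and shipped; the scale travels with them as metadata, and inference kernels either dequantize on the fly or compute directly in integer arithmetic.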
Pages: 1061-1080
Page count: 20
Related Papers
50 records total
  • [21] EEG Classification with Transformer-Based Models
    Sun, Jiayao
    Xie, Jin
    Zhou, Huihui
    2021 IEEE 3RD GLOBAL CONFERENCE ON LIFE SCIENCES AND TECHNOLOGIES (IEEE LIFETECH 2021), 2021, : 92 - 93
  • [22] TMD-BERT: A Transformer-Based Model for Transportation Mode Detection
    Drosouli, Ifigenia
    Voulodimos, Athanasios
    Mastorocostas, Paris
    Miaoulis, Georgios
    Ghazanfarpour, Djamchid
    ELECTRONICS, 2023, 12 (03)
  • [23] Transformer-based models for multimodal irony detection
    Tomás D.
    Ortega-Bueno R.
    Zhang G.
    Rosso P.
    Schifanella R.
    Journal of Ambient Intelligence and Humanized Computing, 2023, 14 (6) : 7399 - 7410
  • [24] Transformer-based Extraction of Deep Image Models
    Battis, Verena
    Penner, Alexander
    2022 IEEE 7TH EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY (EUROS&P 2022), 2022, : 320 - 336
  • [25] Detecting Bot on GitHub Leveraging Transformer-based Models: A Preliminary Study
    Zhang, Jin
    Wu, Xingjin
    Zhang, Yang
    Xu, Shunyu
    PROCEEDINGS OF THE 2023 30TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC 2023, 2023, : 639 - 640
  • [26] On Robustness of Finetuned Transformer-based NLP Models
    Neerudu, Pavan Kalyan Reddy
    Oota, Subba Reddy
    Marreddy, Mounika
    Kagita, Venkateswara Rao
    Gupta, Manish
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 7180 - 7195
  • [27] Adaptation of Transformer-Based Models for Depression Detection
    Adebanji, Olaronke O.
    Ojo, Olumide E.
    Calvo, Hiram
    Gelbukh, Irina
    Sidorov, Grigori
    COMPUTACION Y SISTEMAS, 2024, 28 (01): : 151 - 165
  • [28] BERT-Caps: A Transformer-Based Capsule Network for Tweet Act Classification
    Saha, Tulika
    Ramesh Jayashree, Srivatsa
    Saha, Sriparna
    Bhattacharyya, Pushpak
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2020, 7 (05): : 1168 - 1179
  • [29] Personality BERT: A Transformer-Based Model for Personality Detection from Textual Data
    Jain, Dipika
    Kumar, Akshi
    Beniwal, Rohit
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION NETWORKS (ICCCN 2021), 2022, 394 : 515 - 522
  • [30] Transformer-based semantic segmentation for large-scale building footprint extraction from very-high resolution satellite images
    Gibril, Mohamed Barakat A.
    Al-Ruzouq, Rami
    Shanableh, Abdallah
    Jena, Ratiranjan
    Bolcek, Jan
    Shafri, Helmi Zulhaidi Mohd
    Ghorbanzadeh, Omid
    ADVANCES IN SPACE RESEARCH, 2024, 73 (10) : 4937 - 4954