Fully Quantized Transformer for Machine Translation

Authors
Prato, Gabriele [1]
Charlaix, Ella [2]
Rezagholizadeh, Mehdi [2]
Affiliations
[1] Univ Montreal, Mila, Montreal, PQ, Canada
[2] Huawei Noah's Ark Lab, Montreal, PQ, Canada
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020 | 2020
Keywords
NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
State-of-the-art neural machine translation methods employ massive numbers of parameters. Drastically reducing the computational costs of such methods without affecting performance has been unsuccessful up to this point. To this end, we propose FullyQT: an all-inclusive quantization strategy for the Transformer. To the best of our knowledge, we are the first to show that it is possible to avoid any loss in translation quality with a fully quantized Transformer. Indeed, compared to full-precision, our 8-bit models achieve BLEU scores greater than or equal to the baseline on most tasks. Comparing ourselves to all previously proposed methods, we achieve state-of-the-art quantization results.
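For readers unfamiliar with the term, the sketch below illustrates what 8-bit quantization of a set of values can look like in general. It is a generic uniform (min/max) quantizer written for illustration only; it is not the FullyQT method, whose details are not given in this record. The function names `quantize` and `dequantize` and the sample `weights` are assumptions made for this example.

```python
def quantize(values, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] using a uniform grid.

    Generic illustration only; not the scheme used by FullyQT.
    """
    levels = 2 ** bits - 1
    lo, hi = min(values), max(values)
    # Guard against a zero range (all values identical).
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [c * scale + lo for c in codes]


# Hypothetical example values standing in for model weights.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# The reconstruction error of each value is bounded by about half a grid step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each value is stored as one of 256 integer codes plus a shared scale and offset, so the rounding error per value stays within roughly half of `scale`.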
Page: 1 / 14
Number of pages: 14