Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation

Cited by: 0
Authors
Hao, Yongchang [1 ]
He, Shilin [2 ]
Jiao, Wenxiang [2 ]
Tu, Zhaopeng [3 ]
Lyu, Michael R. [2 ]
Wang, Xing [3 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
[2] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[3] Tencent AI Lab, Bellevue, WA USA
Keywords: (none listed)
DOI: not available
CLC number: TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Non-Autoregressive machine Translation (NAT) models have demonstrated significant inference speedup but suffer from inferior translation accuracy. The common practice to tackle the problem is transferring the Autoregressive machine Translation (AT) knowledge to NAT models, e.g., with knowledge distillation. In this work, we hypothesize and empirically verify that AT and NAT encoders capture different linguistic properties of source sentences. Therefore, we propose to adopt multi-task learning to transfer the AT knowledge to NAT models through encoder sharing. Specifically, we take the AT model as an auxiliary task to enhance NAT model performance. Experimental results on WMT14 English <-> German and WMT16 English <-> Romanian datasets show that the proposed MULTI-TASK NAT achieves significant improvements over the baseline NAT models. Furthermore, the performance on the large-scale WMT19 and WMT20 English <-> German datasets confirms the consistency of our proposed method. In addition, experimental results demonstrate that our MULTI-TASK NAT is complementary to knowledge distillation, the standard knowledge transfer method for NAT.
Pages: 3989-3996 (8 pages)
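
Below is a minimal PyTorch sketch of the multi-task setup the abstract describes: a single encoder feeds both an autoregressive (AT) decoder and a non-autoregressive (NAT) decoder, and training minimizes a joint objective of the assumed form L = L_NAT + lambda * L_AT, with NAT as the main task and AT as the auxiliary one. The class name, the lambda value, the toy dimensions, and the all-[MASK] NAT decoder input (a CMLM-style stand-in) are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: shared-encoder multi-task training for NAT with an
# auxiliary AT objective. Hyperparameters and the NAT decoder input
# scheme are assumptions for illustration, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, D_MODEL, PAD, MASK = 1000, 64, 0, 1
LAMBDA = 0.5  # assumed weight on the auxiliary AT loss

class SharedEncoderMultiTask(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        # One encoder serves both tasks, so AT gradients also shape the
        # source representations that the NAT decoder consumes.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True),
            num_layers=2)
        self.at_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(D_MODEL, nhead=4, batch_first=True),
            num_layers=2)
        self.nat_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(D_MODEL, nhead=4, batch_first=True),
            num_layers=2)
        self.proj = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src, tgt_in, tgt_out):
        memory = self.encoder(self.embed(src))  # shared by both branches
        # AT branch: teacher forcing over shifted targets with a causal mask.
        t = tgt_in.size(1)
        causal = torch.triu(
            torch.full((t, t), float("-inf"), device=src.device), diagonal=1)
        at_logits = self.proj(
            self.at_decoder(self.embed(tgt_in), memory, tgt_mask=causal))
        # NAT branch: all target positions predicted in parallel from an
        # all-[MASK] input (an assumed stand-in for the NAT baseline input).
        nat_in = self.embed(torch.full_like(tgt_out, MASK))
        nat_logits = self.proj(self.nat_decoder(nat_in, memory))
        loss_at = F.cross_entropy(
            at_logits.reshape(-1, VOCAB), tgt_out.reshape(-1), ignore_index=PAD)
        loss_nat = F.cross_entropy(
            nat_logits.reshape(-1, VOCAB), tgt_out.reshape(-1), ignore_index=PAD)
        # NAT is the main task; AT acts as the auxiliary transfer signal.
        return loss_nat + LAMBDA * loss_at

# Toy usage: one random batch, one backward pass through the joint loss.
model = SharedEncoderMultiTask()
src = torch.randint(2, VOCAB, (2, 7))
tgt = torch.randint(2, VOCAB, (2, 5))
loss = model(src, tgt[:, :-1], tgt[:, 1:])
loss.backward()
```

The design point the sketch makes concrete is that the auxiliary AT loss back-propagates into the same encoder the NAT decoder reads from, which is the transfer mechanism the abstract hypothesizes for injecting AT linguistic knowledge into the NAT model.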