Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation

Cited by: 0
Authors
Hao, Yongchang [1 ]
He, Shilin [2 ]
Jiao, Wenxiang [2 ]
Tu, Zhaopeng [3 ]
Lyu, Michael R. [2 ]
Wang, Xing [3 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
[2] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[3] Tencent AI Lab, Bellevue, WA, USA
Keywords: (none listed)
DOI: (none available)
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Non-Autoregressive machine Translation (NAT) models have demonstrated significant inference speedup but suffer from inferior translation accuracy. The common practice to tackle the problem is transferring the Autoregressive machine Translation (AT) knowledge to NAT models, e.g., with knowledge distillation. In this work, we hypothesize and empirically verify that AT and NAT encoders capture different linguistic properties of source sentences. Therefore, we propose to adopt multi-task learning to transfer the AT knowledge to NAT models through encoder sharing. Specifically, we take the AT model as an auxiliary task to enhance NAT model performance. Experimental results on WMT14 English↔German and WMT16 English↔Romanian datasets show that the proposed MULTI-TASK NAT achieves significant improvements over the baseline NAT models. Furthermore, the performance on the large-scale WMT19 and WMT20 English↔German datasets confirms the consistency of our proposed method. In addition, experimental results demonstrate that our MULTI-TASK NAT is complementary to knowledge distillation, the standard knowledge transfer method for NAT.
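The training setup the abstract describes — one shared encoder feeding both the NAT decoder and an auxiliary AT decoder, with the two losses combined — can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the encoder, both loss functions, and the `at_weight` hyperparameter are all stand-in assumptions that only show how gradients from both tasks would flow through shared encoder parameters.

```python
def shared_encoder(src_tokens):
    # Toy "encoding": one feature per source token (stands in for a
    # Transformer encoder shared by the AT and NAT branches).
    return [float(t) for t in src_tokens]

def nat_loss(enc, tgt):
    # Toy per-position loss for the non-autoregressive decoder:
    # each target position is predicted independently.
    return sum(abs(e - t) for e, t in zip(enc, tgt)) / len(tgt)

def at_loss(enc, tgt):
    # Toy left-to-right loss for the autoregressive auxiliary decoder:
    # each step conditions on the previous target token.
    loss, prev = 0.0, 0.0
    for e, t in zip(enc, tgt):
        loss += abs(e + 0.1 * prev - t)
        prev = t
    return loss / len(tgt)

def multi_task_loss(src, tgt, at_weight=0.5):
    # Both branches read the SAME encoder output, so in a real model the
    # encoder parameters receive gradients from both tasks.
    # at_weight balancing the auxiliary AT task is an assumed hyperparameter.
    enc = shared_encoder(src)
    return nat_loss(enc, tgt) + at_weight * at_loss(enc, tgt)
```

With a perfect match between encoding and target, only the AT branch's conditioning term contributes, e.g. `multi_task_loss([1, 2, 3], [1, 2, 3])` yields `0.05`; the weighted sum is the knob that trades off the auxiliary AT signal against the primary NAT objective.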
Pages: 3989-3996 (8 pages)
Related Papers (50 total)
  • [31] Non-autoregressive neural machine translation with auxiliary representation fusion
    Du, Quan
    Feng, Kai
    Xu, Chen
    Xiao, Tong
    Zhu, Jingbo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (06) : 7229 - 7239
  • [32] NON-AUTOREGRESSIVE MACHINE TRANSLATION WITH A NOVEL MASKED LANGUAGE MODEL
    Li Ke
    Li Jie
    Wangjun
    2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022,
  • [33] Hint-Based Training for Non-Autoregressive Machine Translation
    Li, Zhuohan
    Lin, Zi
    He, Di
    Tian, Fei
    Qin, Tao
    Wang, Liwei
    Liu, Tie-Yan
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 5708 - 5713
  • [34] Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade
    Gu, Jiatao
    Kong, Xiang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 120 - 133
  • [35] A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
    Xiao Y.
    Wu L.
    Guo J.
    Li J.
    Zhang M.
    Qin T.
    Liu T.-Y.
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45 (10) : 11407 - 11427
  • [36] Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input
    Guo, Junliang
    Tan, Xu
    He, Di
    Qin, Tao
    Xu, Linli
    Liu, Tie-Yan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 3723 - 3730
  • [37] Improving Robustness of Neural Machine Translation with Multi-task Learning
    Zhou, Shuyan
    Zeng, Xiangkai
    Zhou, Yingqi
    Anastasopoulos, Antonios
    Neubig, Graham
    FOURTH CONFERENCE ON MACHINE TRANSLATION (WMT 2019), 2019, : 565 - 571
  • [38] Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation
    Shao, Chenze
    Feng, Yang
    Zhang, Jinchao
    Meng, Fandong
    Chen, Xilin
    Zhou, Jie
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3013 - 3024
  • [39] Progressive Multi-Granularity Training for Non-Autoregressive Translation
    Ding, Liang
    Wang, Longyue
    Liu, Xuebo
    Wong, Derek F.
    Tao, Dacheng
    Tu, Zhaopeng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2797 - 2803
  • [40] Correcting translation for non-autoregressive transformer
    Wang, Shuheng
    Huang, Heyan
    Shi, Shumin
    Li, Dongbai
    Guo, Dongen
    APPLIED SOFT COMPUTING, 2025, 168