Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation

Cited by: 0
Authors
Hao, Yongchang [1 ]
He, Shilin [2 ]
Jiao, Wenxiang [2 ]
Tu, Zhaopeng [3 ]
Lyu, Michael R. [2 ]
Wang, Xing [3 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
[2] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[3] Tencent AI Lab, Bellevue, WA USA
Keywords: (none listed)
DOI: not available
CLC number: TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Non-Autoregressive machine Translation (NAT) models have demonstrated significant inference speedup but suffer from inferior translation accuracy. The common practice to tackle the problem is transferring the Autoregressive machine Translation (AT) knowledge to NAT models, e.g., with knowledge distillation. In this work, we hypothesize and empirically verify that AT and NAT encoders capture different linguistic properties of source sentences. Therefore, we propose to adopt multi-task learning to transfer the AT knowledge to NAT models through encoder sharing. Specifically, we take the AT model as an auxiliary task to enhance NAT model performance. Experimental results on WMT14 English <-> German and WMT16 English <-> Romanian datasets show that the proposed MULTI-TASK NAT achieves significant improvements over the baseline NAT models. Furthermore, the performance on the large-scale WMT19 and WMT20 English <-> German datasets confirms the consistency of our proposed method. In addition, experimental results demonstrate that our MULTI-TASK NAT is complementary to knowledge distillation, the standard knowledge transfer method for NAT.
Pages: 3989-3996 (8 pages)
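
Below is a minimal PyTorch sketch of the multi-task setup the abstract describes: a single encoder feeds both an autoregressive (AT) decoder and a non-autoregressive (NAT) decoder, and training minimizes a joint objective of the assumed form L = L_NAT + lambda * L_AT, with NAT as the main task and AT as the auxiliary one. The class name, the lambda value, the toy dimensions, and the all-[MASK] NAT decoder input (a CMLM-style stand-in) are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: shared-encoder multi-task training for NAT with an
# auxiliary AT objective. Hyperparameters and the NAT decoder input
# scheme are assumptions for illustration, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, D_MODEL, PAD, MASK = 1000, 64, 0, 1
LAMBDA = 0.5  # assumed weight on the auxiliary AT loss

class SharedEncoderMultiTask(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        # One encoder serves both tasks, so AT gradients also shape the
        # source representations that the NAT decoder consumes.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True),
            num_layers=2)
        self.at_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(D_MODEL, nhead=4, batch_first=True),
            num_layers=2)
        self.nat_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(D_MODEL, nhead=4, batch_first=True),
            num_layers=2)
        self.proj = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src, tgt_in, tgt_out):
        memory = self.encoder(self.embed(src))  # shared by both branches
        # AT branch: teacher forcing over shifted targets with a causal mask.
        t = tgt_in.size(1)
        causal = torch.triu(
            torch.full((t, t), float("-inf"), device=src.device), diagonal=1)
        at_logits = self.proj(
            self.at_decoder(self.embed(tgt_in), memory, tgt_mask=causal))
        # NAT branch: all target positions predicted in parallel from an
        # all-[MASK] input (an assumed stand-in for the NAT baseline input).
        nat_in = self.embed(torch.full_like(tgt_out, MASK))
        nat_logits = self.proj(self.nat_decoder(nat_in, memory))
        loss_at = F.cross_entropy(
            at_logits.reshape(-1, VOCAB), tgt_out.reshape(-1), ignore_index=PAD)
        loss_nat = F.cross_entropy(
            nat_logits.reshape(-1, VOCAB), tgt_out.reshape(-1), ignore_index=PAD)
        # NAT is the main task; AT acts as the auxiliary transfer signal.
        return loss_nat + LAMBDA * loss_at

# Toy usage: one random batch, one backward pass through the joint loss.
model = SharedEncoderMultiTask()
src = torch.randint(2, VOCAB, (2, 7))
tgt = torch.randint(2, VOCAB, (2, 5))
loss = model(src, tgt[:, :-1], tgt[:, 1:])
loss.backward()
```

The design point the sketch makes concrete is that the auxiliary AT loss back-propagates into the same encoder the NAT decoder reads from, which is the transfer mechanism the abstract hypothesizes for injecting AT linguistic knowledge into the NAT model.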