On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation

Cited by: 0
Authors
Chen, Liang [1]
Ma, Shuming [2]
Zhang, Dongdong [2]
Wei, Furu [2]
Chang, Baobao [1]
Affiliations
[1] Peking Univ, Natl Key Lab Multimedia Informat Proc, Beijing, Peoples R China
[2] Microsoft Res, Redmond, WA USA
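Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023) | 2023
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract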
While multilingual neural machine translation has achieved great success, it suffers from the off-target issue, where the translation is in the wrong language. This problem is more pronounced on zero-shot translation tasks. In this work, we find that failing to encode a discriminative target-language signal leads to off-target translation, and that a closer lexical distance (i.e., KL-divergence) between two languages' vocabularies is associated with a higher off-target rate. We also find that simply isolating the vocabularies of different languages in the decoder can alleviate the problem. Motivated by these findings, we propose Language Aware Vocabulary Sharing (LAVS), a simple and effective algorithm for constructing the multilingual vocabulary that greatly alleviates the off-target problem of the translation model by increasing the KL-divergence between languages. We conduct experiments on a multilingual machine translation benchmark in 11 languages. Experiments show that the off-target rate for 90 translation tasks is reduced from 29% to 8%, while the overall BLEU score is improved by an average of 1.9 points, without extra training cost and without sacrificing performance on the supervised directions. We release the code at https://github.com/PKUnlpicler/Off-Target-MNMT for reproduction.
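The abstract measures lexical distance between two languages as a KL-divergence over their vocabularies. The sketch below is one way such a quantity could be computed, not the authors' implementation: it assumes whitespace tokenization, unigram token frequencies over a shared vocabulary, and add-one smoothing, none of which is stated in the abstract. The function names `token_distribution` and `kl_divergence` and the three tiny corpora are hypothetical, made up for illustration only.

```python
import math
from collections import Counter


def token_distribution(sentences, vocab):
    """Unigram distribution over a shared vocabulary.

    Add-one smoothing (an assumption of this sketch) keeps every
    probability strictly positive so the KL-divergence stays finite.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts[t] for t in vocab) + len(vocab)
    return {t: (counts[t] + 1) / total for t in vocab}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions with identical keys."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p)


# Toy corpora (made-up strings, not data from the paper).
corpus_a = ["the cat sat", "the dog ran"]
corpus_b = ["die katze sass", "der hund lief"]   # shares no token with corpus_a
corpus_c = ["de kat zat", "de hond ran"]         # shares the token "ran" with corpus_a

vocab = sorted({tok for s in corpus_a + corpus_b + corpus_c for tok in s.split()})
p_a = token_distribution(corpus_a, vocab)
p_b = token_distribution(corpus_b, vocab)
p_c = token_distribution(corpus_c, vocab)

kl_ab = kl_divergence(p_a, p_b)
kl_ac = kl_divergence(p_a, p_c)
print(f"KL(a || b) = {kl_ab:.4f}")
print(f"KL(a || c) = {kl_ac:.4f}")
```

In this toy setup the pair that shares a token has the smaller divergence, which is the sense in which the abstract speaks of two languages being lexically "closer". KL-divergence is asymmetric, so KL(a || b) and KL(b || a) generally differ.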
Pages: 9542-9558
Page count: 17
Related papers
26 in total
  • [1] Aharoni R, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P3874
  • [2] Arivazhagan Naveen, 2019, MASSIVELY MULTILINGU
  • [3] Callison-Burch Chris, 2010, P JOINT 5 WORKSH STA
  • [4] Chen L, 2022, PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): (SHORT PAPERS), VOL 2, P665
  • [5] Conneau Alexis, 2019, ARXIV
  • [6] Costa-jussa Marta R, 2022, arXiv
  • [7] Goyal Naman, 2021, FLORES 101 EVALUATIO
  • [8] Gu JT, 2019, 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), P1258
  • [9] Guzmán F, 2019, 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019), P6098
  • [10] Ha Thanh-Le, 2016, P 13 INT C SPOK LANG