Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

Citations: 0
Authors
Chen, Guanhua [1]
Ma, Shuming [2]
Chen, Yun [3]
Dong, Li [2]
Zhang, Dongdong [2]
Pan, Jia [1]
Wang, Wenping [1, 4]
Wei, Furu [2]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Texas A&M Univ, College Stn, TX 77843 USA
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021) | 2021
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or on improving the performance of supervised machine translation with BERT. However, whether an MPE can facilitate the cross-lingual transferability of an NMT model remains under-explored. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with a parallel dataset of only one language pair and an off-the-shelf MPE, and is then directly tested on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT leverages the MPE with a two-stage training schedule and gets further improvement with a position-disentangled encoder and a capacity-enhanced decoder. Using this method, SixT significantly outperforms mBART, a pretrained multilingual encoder-decoder model explicitly designed for NMT, with an average improvement of 7.1 BLEU on zero-shot any-to-English test sets across 14 source languages. Furthermore, with much less training computation cost and training data, our model achieves better performance on 15 any-to-English test sets than CRISS and m2m-100, two strong multilingual NMT baselines.
Pages: 15 - 26
Page count: 12
Related papers
50 in total
  • [31] Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization
    Ponti, Edoardo M.
    Vulic, Ivan
    Glavas, Goran
    Mrksic, Nikola
    Korhonen, Anna
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018 : 282 - 293
  • [32] The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer
    Efimov, Pavel
    Boytsov, Leonid
    Arslanova, Elena
    Braslavski, Pavel
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III, 2023, 13982 : 51 - 67
  • [33] Zero-Shot Cross-Lingual Knowledge Transfer in VQA via Multimodal Distillation
    Weng, Yu
    Dong, Jun
    He, Wenbin
    Chaomurilige
    Liu, Xuan
    Liu, Zheng
    Gao, Honghao
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024 : 1 - 11
  • [34] Zero-shot cross-lingual transfer language selection using linguistic similarity
    Eronen, Juuso
    Ptaszynski, Michal
    Masui, Fumito
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (03)
  • [35] Transfer language selection for zero-shot cross-lingual abusive language detection
    Eronen, Juuso
    Ptaszynski, Michal
    Masui, Fumito
    Arata, Masaki
    Leliwa, Gniewosz
    Wroczynski, Michal
    INFORMATION PROCESSING & MANAGEMENT, 2022, 59 (04)
  • [36] Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training
    Huang, Kuan-Hao
    Ahmad, Wasi Uddin
    Peng, Nanyun
    Chang, Kai-Wei
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021 : 1684 - 1697
  • [37] Pruning Residual Networks in Multilingual Neural Machine Translation to Improve Zero-Shot Translation
    Lu, Kaiwen
    Yang, Yating
    Dong, Rui
    Ma, Bo
    Wang, Lei
    Zhou, Xi
    Ahmat, Ahtamjan
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT III, NLPCC 2024, 2025, 15361 : 280 - 292
  • [38] On cross-lingual retrieval with multilingual text encoders
    Litschko, Robert
    Vulic, Ivan
    Ponzetto, Simone Paolo
    Glavas, Goran
    INFORMATION RETRIEVAL JOURNAL, 2022, 25 (02) : 149 - 183
  • [40] Towards zero-shot cross-lingual named entity disambiguation
    Barrena, Ander
    Soroa, Aitor
    Agirre, Eneko
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 184