Pre-Training on Mixed Data for Low-Resource Neural Machine Translation

Cited by: 6
Authors
Zhang, Wenbo [1 ,2 ,3 ]
Li, Xiao [1 ,2 ,3 ]
Yang, Yating [1 ,2 ,3 ]
Dong, Rui [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi 830011, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830011, Peoples R China
Keywords
neural machine translation; pre-training; low resource; word translation;
DOI
10.3390/info12030133
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
The pre-training fine-tuning paradigm has been shown to be effective for low-resource neural machine translation. In this paradigm, models pre-trained on monolingual data are used to initialize translation models, transferring knowledge from the monolingual data into them. Recent pre-training models typically take sentences with randomly masked words as input and are trained to predict the masked words from the unmasked ones. In this paper, we propose a new pre-training method that still predicts masked words, but randomly replaces some of the unmasked words in the input with their translations in another language. The translations come from bilingual data, so the pre-training data contains both monolingual and bilingual data. We conduct experiments on a Uyghur-Chinese corpus to evaluate our method. The experimental results show that our method gives the pre-training model better generalization ability and helps the translation model achieve better performance. Through a word translation task, we also demonstrate that our method enables the embeddings of the translation model to acquire more alignment knowledge.
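The data construction the abstract describes can be sketched in a few lines. This is a minimal, hypothetical illustration (the function and lexicon names are assumptions, not the authors' code): each token is either masked and used as a prediction target, replaced by its translation from a bilingual lexicon, or kept unchanged.

```python
import random

MASK = "[MASK]"

def make_mixed_example(tokens, bilingual_dict, mask_prob=0.15,
                       replace_prob=0.15, rng=None):
    """Build one pre-training example: mask some tokens (the model must
    predict these), and swap some of the remaining tokens for their
    translations from a bilingual lexicon (code-switched context)."""
    rng = rng or random.Random(0)
    inputs, targets = [], []
    for tok in tokens:
        r = rng.random()
        if r < mask_prob:
            # Masked token: prediction target for the pre-training objective.
            inputs.append(MASK)
            targets.append(tok)
        elif r < mask_prob + replace_prob and tok in bilingual_dict:
            # Replaced token: its translation appears in the input instead.
            inputs.append(bilingual_dict[tok])
            targets.append(None)  # not a prediction target
        else:
            # Unchanged token.
            inputs.append(tok)
            targets.append(None)
    return inputs, targets

# Toy bilingual lexicon (hypothetical entries, for illustration only).
lexicon = {"book": "书", "read": "读"}
inp, tgt = make_mixed_example("i read a good book".split(), lexicon,
                              mask_prob=0.3, replace_prob=0.3)
```

With the fixed seed above, "good" is masked and "book" is replaced by its translation, so the model sees a code-switched context while predicting the masked word; this is how bilingual alignment knowledge can leak into the learned embeddings.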
Pages: 10