Low-Resource Neural Machine Translation Using XLNet Pre-training Model

Cited: 3
Authors
Wu, Nier [1 ]
Hou, Hongxu [1 ]
Guo, Ziyue [1 ]
Zheng, Wei [1 ]
Affiliations
[1] Inner Mongolia Univ, Coll Comp Sci, Coll Software, Hohhot, Inner Mongolia, Peoples R China
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V | 2021 / Vol. 12895
Keywords
Low-resource; Machine translation; XLNet; Pre-training;
DOI
10.1007/978-3-030-86383-8_40
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The methods for improving the quality of low-resource neural machine translation (NMT) include: changing the token granularity to reduce the number of low-frequency words; generating pseudo-parallel corpora from large-scale monolingual data to optimize model parameters; and using the auxiliary knowledge of a pre-trained model to train the NMT model. However, reducing token granularity results in a large number of invalid operations and increases the complexity of local reordering on the target side. Pseudo-parallel corpora contain noise that hinders model convergence. Existing pre-training methods also limit translation quality because of human error and the conditional independence assumption. We therefore propose an XLNet-based pre-training method that corrects these defects of the pre-training model and enhances the NMT model's context feature extraction. Experiments on the CCMT2019 Mongolian-Chinese (Mo-Zh), Uyghur-Chinese (Ug-Zh), and Tibetan-Chinese (Ti-Zh) tasks show that both the generalization ability and the BLEU scores of our method improve over the baseline, which verifies the effectiveness of the method.
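The "conditional independence assumption" mentioned in the abstract refers to BERT-style masked language modeling, which predicts all masked tokens independently given the unmasked context. XLNet removes this assumption via permutation language modeling. As background (the formula below is the published XLNet objective from Yang et al., 2019, not text taken from this record):

```latex
\max_{\theta} \; \mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
\left[ \sum_{t=1}^{T} \log p_{\theta}\!\left( x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}} \right) \right]
```

Here \(\mathcal{Z}_T\) is the set of all permutations of the positions \(1,\dots,T\). Because every token is predicted autoregressively conditioned on the tokens preceding it in a sampled permutation, each prediction depends on other targets, and no artificial [MASK] symbols are introduced at pre-training time.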
Pages: 503-514
Page count: 12