Low-Resource Neural Machine Translation with Neural Episodic Control

Cited by: 0
Authors
Wu, Nier [1]
Hou, Hongxu [1]
Sun, Shuo [1]
Zheng, Wei [1]
Affiliations
[1] Inner Mongolia University, College of Computer Science, College of Software, Hohhot, People's Republic of China
Source
2021 International Joint Conference on Neural Networks (IJCNN) | 2021
Keywords
Reinforcement Learning; Machine Translation; DND; Episodic Control; Low-resource
DOI
10.1109/IJCNN52387.2021.9533677
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning (RL) has been shown to alleviate the metric inconsistency between training and evaluation and the exposure deviation in neural machine translation (NMT). However, its sample efficiency is limited by the sampling method (temporal-difference (TD) or Monte Carlo (MC)), and it still cannot compensate for the inefficient non-zero rewards that result from insufficient data. In addition, RL rewards are effective only once the model parameters are largely determined. We therefore propose an episodic control reinforcement learning method that obtains a model with largely determined parameters through knowledge transfer and records historical action trajectories in a semi-tabular differentiable neural dictionary (DND), so that the model can quickly approximate the true state value from sampled rewards when updating the policy. We evaluate the method on the CCMT2019 Mongolian-Chinese (Mo-Zh), Tibetan-Chinese (Ti-Zh), and Uyghur-Chinese (Ug-Zh) tasks; the results show a significant improvement in translation quality, demonstrating the effectiveness of the method.
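The abstract gives no implementation details for the DND, so the following is only a minimal sketch of how a semi-tabular memory of this kind is commonly read: stored keys are compared with a query, and the values of the nearest keys are averaged with inverse-distance weights. The class name `ToyDND`, the parameters `delta` and `k`, and the choice of kernel are assumptions made for illustration; they are not taken from the paper.

```python
class ToyDND:
    """Illustrative key-value memory with a kernel-weighted nearest-neighbour read.

    Hypothetical sketch only; not the authors' implementation.
    """

    def __init__(self, delta=1e-3):
        self.keys = []      # stored state embeddings (lists of floats)
        self.values = []    # stored value estimates, one per key
        self.delta = delta  # assumed smoothing term; avoids division by zero

    def write(self, key, value):
        # Append a (key, value) pair; a real memory would also update or evict entries.
        self.keys.append(list(key))
        self.values.append(float(value))

    def lookup(self, query, k=2):
        # Weight each entry by 1 / (squared distance + delta), keep the k largest
        # weights, and return the normalized weighted average of their values.
        if not self.keys:
            raise ValueError("memory is empty")
        scored = []
        for key, value in zip(self.keys, self.values):
            dist2 = sum((q - x) ** 2 for q, x in zip(query, key))
            scored.append((1.0 / (dist2 + self.delta), value))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        top = scored[:k]
        total = sum(w for w, _ in top)
        return sum(w * v for w, v in top) / total


memory = ToyDND()
memory.write([0.0, 0.0], 1.0)
memory.write([1.0, 0.0], 3.0)
memory.write([10.0, 10.0], 100.0)
estimate = memory.lookup([0.5, 0.0], k=2)  # query halfway between the first two keys
```

Because the read is a smooth function of the query and the stored values, a memory like this can be trained jointly with the network that produces the keys, which is one plausible reading of "differentiable" in the abstract.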
Pages: 7