Lexical-Constraint-Aware Neural Machine Translation via Data Augmentation

Cited by: 0
Authors
Chen, Guanhua [1]
Chen, Yun [2]
Wang, Yong [1]
Li, Victor O. K. [1]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
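Source
PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2020
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract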
Leveraging lexical constraints is important in domain-specific machine translation and interactive machine translation. Previous studies mainly focus on extending the beam search algorithm or augmenting the training corpus by replacing source phrases with their corresponding target translations. These methods either incur heavy computational cost during inference or depend on the quality of a bilingual dictionary that is pre-specified by the user or constructed with statistical machine translation. In response to these problems, we present a conceptually simple and empirically effective data augmentation approach for lexically constrained neural machine translation. Specifically, we construct constraint-aware training data by first randomly sampling phrases from the reference as constraints and then packing them into the source sentence with a separation symbol. Extensive experiments on several language pairs demonstrate that our approach achieves better translation results than existing systems, improving the translation of constrained sentences without hurting unconstrained ones.
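
The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the separator token `<sep>`, the function names `sample_constraints` and `augment`, the limits on phrase count and length, and the choice to sometimes sample zero constraints are all assumptions made for illustration, since the abstract does not specify them.

```python
import random

SEP = "<sep>"  # hypothetical separator token; the abstract does not name the symbol


def sample_constraints(reference_tokens, max_constraints=2, max_len=3, rng=None):
    """Randomly sample non-overlapping contiguous phrases from the reference.

    The limits max_constraints and max_len are illustrative assumptions.
    """
    if rng is None:
        rng = random.Random()
    n = len(reference_tokens)
    used = set()   # token positions already covered by a sampled phrase
    spans = []
    # Drawing from 0..max_constraints means some examples stay unconstrained.
    for _ in range(rng.randint(0, max_constraints)):
        if n == 0:
            break
        length = rng.randint(1, min(max_len, n))
        start = rng.randint(0, n - length)
        positions = range(start, start + length)
        if used.isdisjoint(positions):  # skip candidates that overlap earlier ones
            used.update(positions)
            spans.append((start, start + length))
    spans.sort()  # keep the phrases in reference order
    return [reference_tokens[a:b] for a, b in spans]


def augment(source, reference, rng=None, **kwargs):
    """Return (packed_source, reference, constraints) for one training pair."""
    constraints = sample_constraints(reference.split(), rng=rng, **kwargs)
    packed = source
    for phrase in constraints:
        # Append each sampled phrase to the source after a separator symbol.
        packed += f" {SEP} " + " ".join(phrase)
    return packed, reference, constraints
```

Calling `augment("das ist ein kleiner Test", "this is a small test")` returns the source sentence followed by zero or more `<sep>`-delimited phrases drawn from the reference, paired with the unchanged reference as the training target.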
Pages: 3587-3593
Page count: 7