Document-Level Neural Machine Translation with Associated Memory Network

Cited by: 0
Authors
Jiang, Shu [1, 2]
Wang, Rui [1, 2]
Li, Zuchao [1, 2]
Utiyama, Masao [3]
Chen, Kehai [3]
Sumita, Eiichiro [3]
Zhao, Hai [1, 2]
Lu, Bao-liang [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interac, Shanghai 200240, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
Funding
National Natural Science Foundation of China
Keywords
memory network; neural machine translation; document-level context
DOI
10.1587/transinf.2020EDP7244
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Standard neural machine translation (NMT) assumes that each sentence can be translated independently of its document-level context. Most existing document-level NMT approaches settle for a coarse sense of global document-level information, whereas this work focuses on exploiting detailed document-level context by means of a memory network. The memory network's capacity to detect the parts of memory most relevant to the current sentence makes it a natural way to model rich document-level context. In this work, the proposed document-aware memory network is implemented to enhance the Transformer NMT baseline. Experiments on several tasks show that the proposed method significantly improves NMT performance over strong Transformer baselines and other related studies.
Pages: 1712-1723
Page count: 12