Self-Attention with Cross-Lingual Position Representation

Cited by: 0
Authors
Ding, Liang [1 ]
Wang, Longyue [2 ]
Tao, Dacheng [1 ]
Affiliations
[1] Univ Sydney, Fac Engn, UBTECH Sydney AI Ctr, Sch Comp Sci, Sydney, NSW, Australia
[2] Tencent AI Lab, Shenzhen, Peoples R China
Source
58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) | 2020
Funding
Australian Research Council;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences. However, in cross-lingual scenarios, e.g., machine translation, the PEs of source and target sentences are modeled independently. Due to word order divergences in different languages, modeling the cross-lingual positional relationships might help SANs tackle this problem. In this paper, we augment SANs with cross-lingual position representations to model the bilingually aware latent structure for the input sentence. Specifically, we utilize bracketing transduction grammar (BTG)-based reordering information to encourage SANs to learn bilingual diagonal alignments. Experimental results on WMT'14 English⇒German, WAT'17 Japanese⇒English, and WMT'17 Chinese⇔English translation tasks demonstrate that our approach significantly and consistently improves translation quality over strong baselines. Extensive analyses confirm that the performance gains come from the cross-lingual information.
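To make the abstract's idea concrete, the following is a minimal illustrative sketch (not the authors' released implementation) of how a Transformer-style encoder input could carry two position signals at once: the usual monolingual positions 0..n-1 plus externally supplied cross-lingual positions produced by a BTG-style pre-reordering step. The class name CrossLingualPositionEmbedding, the tensor shapes, and the simple additive fusion are assumptions made for illustration; the BTG reordering itself is treated as a given input rather than computed here.

```python
import torch
import torch.nn as nn

class CrossLingualPositionEmbedding(nn.Module):
    """Sketch: add a second position embedding table indexed by
    externally supplied cross-lingual (reordered) positions, on top of
    the standard monolingual position embedding. The reordered indices
    are assumed to come from a BTG-based pre-reordering step (not shown)."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        self.mono_pos = nn.Embedding(max_len, d_model)   # usual 0..n-1 positions
        self.cross_pos = nn.Embedding(max_len, d_model)  # cross-lingual positions

    def forward(self, token_emb: torch.Tensor, reordered_idx: torch.Tensor) -> torch.Tensor:
        # token_emb: (batch, seq_len, d_model) token embeddings
        # reordered_idx: (batch, seq_len) long tensor of cross-lingual positions
        batch, seq_len, _ = token_emb.shape
        mono_idx = torch.arange(seq_len, device=token_emb.device).expand(batch, seq_len)
        return token_emb + self.mono_pos(mono_idx) + self.cross_pos(reordered_idx)


# Hypothetical usage: two source tokens whose target-side order is swapped,
# so the BTG-style cross-lingual positions are [1, 0] instead of [0, 1].
emb = nn.Embedding(1000, 8)
layer = CrossLingualPositionEmbedding(d_model=8)
tokens = torch.tensor([[3, 7]])
reordered = torch.tensor([[1, 0]])
out = layer(emb(tokens), reordered)
print(out.shape)  # torch.Size([1, 2, 8])
```

The additive fusion above is only one plausible way to combine the two position signals; the point of the sketch is that the encoder sees both where a word sits in the source sentence and where it is expected to land after reordering toward the target language.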
Pages: 1679-1685
Page count: 7