Fine-grained attention mechanism for neural machine translation

Cited by: 131
Authors
Choi, Heeyoul [1]
Cho, Kyunghyun [2]
Bengio, Yoshua [3]
Affiliations
[1] Handong Global Univ, Pohang, South Korea
[2] NYU, Comp Sci & Data Sci, New York, NY USA
[3] Univ Montreal, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; National Research Foundation, Singapore
Keywords
Neural machine translation; Attention mechanism; Fine-grained attention;
DOI
10.1016/j.neucom.2018.01.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural machine translation (NMT) has been a new paradigm in machine translation, and the attention mechanism has become the dominant approach with the state-of-the-art records in many language pairs. While there are variants of the attention mechanism, all of them use only temporal attention where one scalar value is assigned to one context vector corresponding to a source word. In this paper, we propose a fine-grained (or 2D) attention mechanism where each dimension of a context vector will receive a separate attention score. In experiments with the task of En-De and En-Fi translation, the fine-grained attention method improves the translation quality in terms of BLEU score. In addition, our alignment analysis reveals how the fine-grained attention mechanism exploits the internal structure of context vectors. © 2018 Elsevier B.V. All rights reserved.
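The contrast drawn in the abstract, one scalar score per source word versus one score per dimension of each context vector, can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, the scores are passed in as plain inputs because the abstract does not say how they are computed, and normalising each dimension's scores over the source words with a softmax is an assumption made here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(context_vectors, scores):
    """Temporal attention: one scalar score per source word.

    context_vectors: T vectors of length D (one per source word).
    scores: T floats (hypothetical input; the scoring model is not shown).
    Returns a single length-D weighted sum.
    """
    weights = softmax(scores)
    dim = len(context_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, context_vectors))
        for d in range(dim)
    ]


def fine_grained_attention(context_vectors, scores):
    """Fine-grained (2D) attention: one score per source word AND dimension.

    scores: T rows of D floats (hypothetical input).
    Assumption: each dimension is normalised separately over the T words.
    """
    dim = len(context_vectors[0])
    summary = []
    for d in range(dim):
        # Each dimension gets its own distribution over source words.
        weights = softmax([row[d] for row in scores])
        summary.append(
            sum(w * vec[d] for w, vec in zip(weights, context_vectors))
        )
    return summary
```

Under these assumptions, if every dimension of a word is given the same score, the fine-grained sum reduces to the temporal one; the two differ only when the scores vary across dimensions, which lets different dimensions attend to different source words.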
Pages: 171-176
Page count: 6