Parallel Attention Mechanisms in Neural Machine Translation

Cited by: 11
Authors
Medina, Julian Richard [1 ]
Kalita, Jugal [1 ]
Affiliations
[1] University of Colorado, Department of Computer Science, Colorado Springs, CO 80907, USA
Source
2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018
Funding
National Science Foundation (US)
Keywords
machine translation; transformer; attention;
DOI
10.1109/ICMLA.2018.00088
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Recent papers in neural machine translation have proposed relying exclusively on attention mechanisms in place of previous standards such as recurrent and convolutional neural networks (RNNs and CNNs). We propose that by running the traditionally stacked encoder branches of attention-focused encoder-decoder architectures in parallel, even more sequential operations can be removed from the model, thereby decreasing training time. In particular, we modify Google's recently published attention-based Transformer architecture by replacing sequential attention modules with parallel ones, reducing training time while substantially improving BLEU scores. Experiments on the English-to-German and English-to-French translation tasks show that our model establishes a new state of the art.
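
The record reproduces only the abstract, which describes the modification at a high level: attention modules that the standard Transformer applies sequentially as stacked layers are instead run in parallel and their outputs combined. No code is included, so the following is a minimal PyTorch sketch of that idea under stated assumptions; the class name ParallelAttentionEncoder, the default branch count, and the averaging merge are illustrative choices, not the authors' implementation.

import torch
import torch.nn as nn

class ParallelAttentionEncoder(nn.Module):
    # Illustrative sketch (not the paper's code): rather than feeding
    # the input through N encoder layers one after another, run N
    # branches on the same input and merge their outputs, removing
    # the sequential dependency between layers.
    def __init__(self, d_model=512, nhead=8, num_branches=6):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
             for _ in range(num_branches)]
        )

    def forward(self, x):
        # Each branch attends to the same input embeddings independently;
        # with enough devices, the branches can execute concurrently.
        outputs = [branch(x) for branch in self.branches]
        # Averaging the branch outputs is an assumption made for this
        # sketch; the paper's actual combination scheme may differ.
        return torch.stack(outputs, dim=0).mean(dim=0)

# Usage: a batch of 2 sequences of 10 tokens with model width 512.
encoder = ParallelAttentionEncoder()
src = torch.randn(2, 10, 512)
print(encoder(src).shape)  # torch.Size([2, 10, 512])

A stacked encoder must compute its layers strictly in sequence, whereas here the per-branch work is independent; that independence is the source of the training-time reduction the abstract claims.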
Pages: 547-552
Number of pages: 6
Related papers
50 records in total
  • [1] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021 : 3216 - 3225
  • [2] Neural Machine Translation with Deep Attention
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (01) : 154 - 163
  • [3] Attention-via-Attention Neural Machine Translation
    Zhao, Shenjian
    Zhang, Zhihua
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018 : 563 - 570
  • [4] Sparse and Constrained Attention for Neural Machine Translation
    Malaviya, Chaitanya
    Ferreira, Pedro
    Martins, Andre F. T.
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018 : 370 - 376
  • [5] Bilingual attention based neural machine translation
    Kang, Liyan
    He, Shaojie
    Wang, Mingxuan
    Long, Fei
    Su, Jinsong
    APPLIED INTELLIGENCE, 2023, 53 (04) : 4302 - 4315
  • [6] Chinese-Catalan: A Neural Machine Translation Approach Based on Pivoting and Attention Mechanisms
    Costa-Jussa, Marta R.
    Casas, Noe
    Escolano, Carlos
    Fonollosa, Jose A. R.
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2019, 18 (04)
  • [7] Attention Calibration for Transformer in Neural Machine Translation
    Lu, Yu
    Zeng, Jiali
    Zhang, Jiajun
    Wu, Shuangzhi
    Li, Mu
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021 : 1288 - 1298
  • [8] Attention With Sparsity Regularization for Neural Machine Translation and Summarization
    Zhang, Jiajun
    Zhao, Yang
    Li, Haoran
    Zong, Chengqing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (03) : 507 - 518
  • [9] Neural Machine Translation with Target-Attention Model
    Yang, Mingming
    Zhang, Min
    Chen, Kehai
    Wang, Rui
    Zhao, Tiejun
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, E103D (03) : 684 - 694