Parallel Attention Mechanisms in Neural Machine Translation

Cited by: 11
Authors
Medina, Julian Richard [1 ]
Kalita, Jugal [1 ]
Affiliations
[1] University of Colorado, Department of Computer Science, Colorado Springs, CO 80907, USA
Source
2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018
Funding
National Science Foundation (US)
Keywords
machine translation; transformer; attention;
DOI
10.1109/ICMLA.2018.00088
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Recent papers in neural machine translation have proposed relying exclusively on attention mechanisms in place of previous standards such as recurrent and convolutional neural networks (RNNs and CNNs). We propose that by running the traditionally stacked encoder branches of attention-focused encoder-decoder architectures in parallel, even more sequential operations can be removed from the model, thereby decreasing training time. In particular, we modify Google's recently published attention-based Transformer architecture by replacing sequential attention modules with parallel ones, reducing training time while substantially improving BLEU scores. Experiments on the English-to-German and English-to-French translation tasks show that our model establishes a new state of the art.
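
The record reproduces only the abstract, which describes the modification at a high level: attention modules that the standard Transformer applies sequentially as stacked layers are instead run in parallel and their outputs combined. No code is included, so the following is a minimal PyTorch sketch of that idea under stated assumptions; the class name ParallelAttentionEncoder, the default branch count, and the averaging merge are illustrative choices, not the authors' implementation.

import torch
import torch.nn as nn

class ParallelAttentionEncoder(nn.Module):
    # Illustrative sketch (not the paper's code): rather than feeding
    # the input through N encoder layers one after another, run N
    # branches on the same input and merge their outputs, removing
    # the sequential dependency between layers.
    def __init__(self, d_model=512, nhead=8, num_branches=6):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
             for _ in range(num_branches)]
        )

    def forward(self, x):
        # Each branch attends to the same input embeddings independently;
        # with enough devices, the branches can execute concurrently.
        outputs = [branch(x) for branch in self.branches]
        # Averaging the branch outputs is an assumption made for this
        # sketch; the paper's actual combination scheme may differ.
        return torch.stack(outputs, dim=0).mean(dim=0)

# Usage: a batch of 2 sequences of 10 tokens with model width 512.
encoder = ParallelAttentionEncoder()
src = torch.randn(2, 10, 512)
print(encoder(src).shape)  # torch.Size([2, 10, 512])

A stacked encoder must compute its layers strictly in sequence, whereas here the per-branch work is independent; that independence is the source of the training-time reduction the abstract claims.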
Pages: 547-552
Number of pages: 6
Related papers
50 records in total
  • [1] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021 : 3216 - 3225
  • [2] Neural Machine Translation with Deep Attention
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (01) : 154 - 163
  • [3] Attention-via-Attention Neural Machine Translation
    Zhao, Shenjian
    Zhang, Zhihua
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018 : 563 - 570
  • [4] Sparse and Constrained Attention for Neural Machine Translation
    Malaviya, Chaitanya
    Ferreira, Pedro
    Martins, Andre F. T.
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018 : 370 - 376
  • [5] Bilingual attention based neural machine translation
    Kang, Liyan
    He, Shaojie
    Wang, Mingxuan
    Long, Fei
    Su, Jinsong
    APPLIED INTELLIGENCE, 2023, 53 (04) : 4302 - 4315
  • [6] Chinese-Catalan: A Neural Machine Translation Approach Based on Pivoting and Attention Mechanisms
    Costa-Jussa, Marta R.
    Casas, Noe
    Escolano, Carlos
    Fonollosa, Jose A. R.
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2019, 18 (04)
  • [7] Attention Calibration for Transformer in Neural Machine Translation
    Lu, Yu
    Zeng, Jiali
    Zhang, Jiajun
    Wu, Shuangzhi
    Li, Mu
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021 : 1288 - 1298
  • [8] Attention With Sparsity Regularization for Neural Machine Translation and Summarization
    Zhang, Jiajun
    Zhao, Yang
    Li, Haoran
    Zong, Chengqing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (03) : 507 - 518
  • [9] Neural Machine Translation with Target-Attention Model
    Yang, Mingming
    Zhang, Min
    Chen, Kehai
    Wang, Rui
    Zhao, Tiejun
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, E103D (03) : 684 - 694