End-to-end entity-aware neural machine translation

Cited by: 16

Authors:

Xie, Shufang [1]

Xia, Yingce [2]

Wu, Lijun [2]

Huang, Yiqing [3]

Fan, Yang [4]

Qin, Tao [2]

Affiliations:

[1] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing 100872, Peoples R China

[2] Microsoft Res, Beijing 100080, Peoples R China

[3] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China

[4] Univ Sci & Technol China, Sch Comp Sci, Hefei 230026, Anhui, Peoples R China

Source:

Machine Learning | 2022 / Vol. 111 / Issue 03

Keywords:

Machine translation; Named entity;

DOI:

10.1007/s10994-021-06073-9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Accurate translation of entities (e.g., person names, organizations, geographic locations) is important in neural machine translation (NMT): entities are usually harder to translate than other words, and translating them incorrectly greatly hurts the user experience. In previous work, entities are either treated in the same way as other words, which leads to inaccurate translation, or handled in multiple steps (named entity recognition, translation, and replacing the entities back), which significantly increases inference latency. In this work, we propose an end-to-end algorithm that carefully handles the translation of entities. It differs from conventional NMT models in two main ways: (1) Entity classifiers are attached to the encoder and the decoder to predict whether each input token is a named entity, so that the encoder and decoder can treat named entities differently; (2) The translation loss of each target token is adaptively increased according to the probability that the target token is a named entity, which results in more accurate translation of entities. At inference time, these two parts are removed, so the translation model keeps the same inference speed as conventional NMT models. Empirical results on six translation tasks demonstrate that our method improves translation quality. Specifically, we improve by 1.7 BLEU points on Japanese-to-English translation and by 4.6 entity F1 points on English-to-Chinese translation, without additional inference cost.
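The adaptive loss in point (2) can be illustrated with a small sketch. The abstract does not give the exact formula, so the scaling factor `1 + alpha * p_entity`, the hyperparameter `alpha`, and the function names below are assumptions made for illustration, not the authors' implementation. The entity probabilities would come from the decoder-side entity classifier during training; here they are passed in as plain numbers.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entity_weighted_loss(token_logits, target_ids, entity_probs, alpha=1.0):
    """Mean per-token cross-entropy, with each token's loss scaled up by
    the probability that the token is a named entity.

    Assumed form (not taken from the paper): weight = 1 + alpha * p_entity.
    With alpha = 0 this reduces to ordinary cross-entropy.
    """
    total = 0.0
    for logits, target, p_ent in zip(token_logits, target_ids, entity_probs):
        nll = -math.log(softmax(logits)[target])  # standard translation loss
        total += (1.0 + alpha * p_ent) * nll      # larger weight for likely entities
    return total / len(target_ids)

# Two target tokens over a toy vocabulary of size 3.
logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 1.5]]
targets = [0, 2]
plain = entity_weighted_loss(logits, targets, [0.0, 0.0])
weighted = entity_weighted_loss(logits, targets, [0.0, 0.9])
print(plain, weighted)
```

Because the weighting only changes the training objective, nothing in this sketch has to run at inference time, which is consistent with the abstract's claim that decoding speed is unchanged.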

Pages: 1181-1203

Page count: 23
