Pre-training neural machine translation with alignment information via optimal transport

Cited: 0
Authors
Su, Xueping [1 ]
Zhao, Xingkai [1 ]
Ren, Jie [1 ]
Li, Yunhong [1 ]
Raetsch, Matthias [2 ]
Affiliations
[1] Xian Polytech Univ, Sch Elect & Informat, Xian, Peoples R China
[2] Reutlingen Univ, Dept Engn, Interact & Mobile Robot & Artificial Intelligence, Reutlingen, Germany
Funding
National Natural Science Foundation of China;
Keywords
Optimal Transport; Alignment Information; Pre-training; Neural Machine Translation;
DOI
10.1007/s11042-023-17479-z
Chinese Library Classification
TP [Automation and Computer Technology];
Discipline code
0812 ;
Abstract
With the rapid development of globalization, the demand for translation between different languages keeps increasing. Although pre-training has achieved excellent results in neural machine translation, existing neural machine translation systems have almost no high-quality alignment information suited to specific domains, so this paper proposes pre-training neural machine translation with alignment information via optimal transport. First, this paper narrows the representation gap between different languages by using OTAP to generate domain-specific data for information alignment, learning richer semantic information. Second, this paper proposes a lightweight model, DR-Reformer, which uses Reformer as the backbone network and adds Dropout layers and Reduction layers, reducing model parameters and improving computational efficiency without losing accuracy. Experiments on the Chinese-English datasets of AI Challenger 2018 and WMT-17 show that the proposed algorithm performs better than existing algorithms.
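To illustrate the general idea of optimal-transport-based alignment that the abstract refers to, the sketch below computes a soft alignment between source and target token embeddings with entropic optimal transport (Sinkhorn iterations). This is a generic textbook construction, not the paper's OTAP implementation; the cost function, `reg`, and `n_iters` are illustrative assumptions.

```python
import numpy as np

def sinkhorn_alignment(src, tgt, reg=0.1, n_iters=100):
    """Soft-align two sets of embeddings via entropic optimal transport.

    src: (m, d) source token embeddings; tgt: (n, d) target embeddings.
    Returns an (m, n) transport plan whose entries act as alignment weights.
    """
    # Cost matrix: pairwise (1 - cosine similarity) between embeddings.
    src_n = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt_n = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    cost = 1.0 - src_n @ tgt_n.T

    # Uniform marginals over source and target tokens.
    a = np.full(src.shape[0], 1.0 / src.shape[0])
    b = np.full(tgt.shape[0], 1.0 / tgt.shape[0])

    K = np.exp(-cost / reg)          # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iters):         # alternating marginal projections
        v = b / (K.T @ u)
        u = a / (K @ v)

    # Transport plan: diag(u) @ K @ diag(v).
    return u[:, None] * K * v[None, :]

# Toy example: 3 source tokens, 4 target tokens, 8-dim embeddings.
rng = np.random.default_rng(0)
plan = sinkhorn_alignment(rng.normal(size=(3, 8)), rng.normal(size=(4, 8)))
print(plan.shape)   # (3, 4)
print(plan.sum())   # sums to 1.0: a valid joint distribution over token pairs
```

The plan's rows and columns match the prescribed marginals, so each source token distributes its alignment mass over all target tokens; in an alignment-aware pre-training objective, such a plan can weight which token pairs should have similar representations.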
Pages: 48377-48397 (21 pages)
Related papers
50 in total
  • [31] Finding a good initial configuration of parameters for restricted Boltzmann machine pre-training
    Xie, Chunzhi
    Lv, Jiancheng
    Li, Xiaojie
    SOFT COMPUTING, 2017, 21 (21) : 6471 - 6479
  • [32] Code-aware fault localization with pre-training and interpretable machine learning
    Zhang, Zhuo
    Li, Ya
    Yang, Sha
    Zhang, Zhanjun
    Lei, Yan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [33] SAR2NDVI: PRE-TRAINING FOR SAR-TO-NDVI IMAGE TRANSLATION
    Kimura, Daiki
    Ishikawa, Tatsuya
    Mitsugi, Masanori
    Kitakoshi, Yasunori
    Tanaka, Takahiro
    Simumba, Naomi
    Tanaka, Kentaro
    Wakabayashi, Hiroaki
    Sampei, Masato
    Tatsubori, Michiaki
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 3865 - 3869
  • [35] Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation
    Skurzhanskyi, O. H.
    Marchenko, O. O.
    Anisimov, A. V.
    CYBERNETICS AND SYSTEMS ANALYSIS, 2024, 60 (02) : 167 - 174
  • [36] The Reduction of Fully Connected Neural Network Parameters Using the Pre-training Technique
    Kroshchanka, Aliaksandr
    Golovko, Vladimir
    PROCEEDINGS OF THE 11TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS: TECHNOLOGY AND APPLICATIONS (IDAACS'2021), VOL 2, 2021, : 937 - 941
  • [38] GPPT: Graph Pre-training and Prompt Tuning to Generalize Graph Neural Networks
    Sun, Mingchen
    Zhou, Kaixiong
    He, Xin
    Wang, Ying
    Wang, Xin
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 1717 - 1727
  • [39] Neural Machine Translation by Fusing Key Information of Text
    Hu, Shijie
    Li, Xiaoyu
    Bai, Jiayu
    Lei, Hang
    Qian, Weizhong
    Hu, Sunqiang
    Zhang, Cong
    Kofi, Akpatsa Samuel
    Qiu, Qian
    Zhou, Yong
    Yang, Shan
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (02): : 2803 - 2815
  • [40] RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training
    Qiao, Ziyue
    Fu, Yanjie
    Wang, Pengyang
    Xiao, Meng
    Ning, Zhiyuan
    Zhang, Denghui
    Du, Yi
    Zhou, Yuanchun
    IEEE TRANSACTIONS ON BIG DATA, 2023, 9 (01) : 186 - 199