Pre-training neural machine translation with alignment information via optimal transport

Cited by: 0
Authors
Su, Xueping [1]
Zhao, Xingkai [1]
Ren, Jie [1]
Li, Yunhong [1]
Raetsch, Matthias [2]
Affiliations
[1] Xian Polytech Univ, Sch Elect & Informat, Xian, Peoples R China
[2] Reutlingen Univ, Dept Engn, Interact & Mobile Robot & Artificial Intelligence, Reutlingen, Germany
Funding
National Natural Science Foundation of China;
Keywords
Optimal Transport; Alignment Information; Pre-training; Neural Machine Translation;
DOI
10.1007/s11042-023-17479-z
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With the rapid development of globalization, the demand for translation between languages is increasing. Although pre-training has achieved excellent results in neural machine translation, existing neural machine translation systems have almost no high-quality alignment information suitable for specific fields. This paper therefore proposes a pre-training method for neural machine translation with alignment information via optimal transport. First, it narrows the representation gap between languages by using OTAP to generate domain-specific data for information alignment, which lets the model learn richer semantic information. Second, it proposes a lightweight model, DR-Reformer, which uses Reformer as the backbone network and adds Dropout and Reduction layers, reducing model parameters without losing accuracy and improving computational efficiency. Experiments on the Chinese and English datasets of AI Challenger 2018 and WMT-17 show that the proposed algorithm outperforms existing algorithms.
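The abstract does not describe how OTAP works internally, so the following is only a generic illustration of how optimal transport can yield a soft word alignment. It is not the authors' method. It runs plain Sinkhorn iterations on a cosine-distance cost between toy token embeddings; the embeddings, the uniform token weights, the regularisation value `reg`, and the function names `cosine_cost` and `sinkhorn` are all assumptions made for this sketch.

```python
import math

def cosine_cost(src, tgt):
    """Cost matrix of (1 - cosine similarity) between two lists of vectors."""
    def norm(x):
        return math.sqrt(sum(p * p for p in x))
    def cos(x, y):
        return sum(p * q for p, q in zip(x, y)) / (norm(x) * norm(y))
    return [[1.0 - cos(s, t) for t in tgt] for s in src]

def sinkhorn(cost, a, b, reg=0.1, n_iters=200):
    """Entropy-regularised transport plan between weight vectors a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical 2-d embeddings for two source and two target tokens.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt = [[0.1, 0.9], [0.9, 0.1]]

# Uniform mass on every token (an assumption of this sketch).
plan = sinkhorn(cosine_cost(src, tgt), [0.5, 0.5], [0.5, 0.5])

# Hard alignment: each source token maps to its highest-mass target token.
alignment = [max(range(len(row)), key=lambda j: row[j]) for row in plan]
print(alignment)
```

Each row of `plan` says how one source token's mass is spread over the target tokens, so taking the row-wise maximum gives a hard alignment, while the full matrix can serve as a soft alignment signal.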
Pages: 48377-48397
Page count: 21