Multilingual Pre-training Model-Assisted Contrastive Learning Neural Machine Translation

Cited by: 0
|
Authors
Sun, Shuo [1]
Hou, Hong-xu [1]
Yang, Zong-heng [1]
Wang, Yi-song [1]
Affiliations
[1] Inner Mongolia Univ, Coll Comp Sci, Natl & Local Joint Engn Res Ctr Intelligent Infor, Inner Mongolia Key Lab Mongolian Informat Proc Te, Hohhot, Peoples R China
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
Low-Resource NMT; Pre-training Model; Contrastive Learning; Dynamic Training;
DOI
10.1109/IJCNN54540.2023.10191766
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Since pre-training and fine-tuning have become a successful paradigm in Natural Language Processing (NLP), this paper adopts the state-of-the-art pre-training model CeMAT as a strong assistant for low-resource ethnic-language translation tasks. To address the exposure bias problem in the fine-tuning process, we use a contrastive learning framework and propose a new method for generating contrastive examples, which uses the model's own predictions as contrastive examples so that the model is exposed to its inference-time errors during training. Moreover, to make effective use of the limited bilingual data in low-resource tasks, this paper proposes a dynamic training strategy that fine-tunes and refines the model step by step, using word embedding norm and uncertainty as the criteria for evaluating the data and the model, respectively. Experimental results demonstrate that our method significantly improves translation quality over the baselines, which fully verifies its effectiveness.
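The abstract mentions two mechanisms: self-generated predictions used as contrastive examples, and an uncertainty signal used to drive a dynamic training schedule. Below is a minimal PyTorch-style sketch of how such components are commonly realized; it is an illustrative assumption, not the authors' implementation, and the function names, the triplet-style margin loss, and the entropy-based uncertainty score are all hypothetical choices made for clarity.

import torch
import torch.nn.functional as F


def contrastive_loss(anchor: torch.Tensor,
                     positive: torch.Tensor,
                     negative: torch.Tensor,
                     margin: float = 0.3) -> torch.Tensor:
    """Triplet-style contrastive loss on sentence-level representations.

    anchor   -- encoder representation of the source sentence
    positive -- representation of the reference translation
    negative -- representation of the model's own (error-prone) prediction
    """
    pos_sim = F.cosine_similarity(anchor, positive, dim=-1)
    neg_sim = F.cosine_similarity(anchor, negative, dim=-1)
    # Push the self-generated prediction at least `margin` further from the
    # source than the reference translation.
    return F.relu(margin - pos_sim + neg_sim).mean()


def token_uncertainty(logits: torch.Tensor) -> torch.Tensor:
    """Mean per-token entropy of the decoder output, a simple proxy for
    model uncertainty that a dynamic training schedule could consume."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum(dim=-1)  # (batch, seq)
    return entropy.mean(dim=-1)                               # (batch,)


if __name__ == "__main__":
    batch, dim, seq, vocab = 4, 512, 20, 32000
    anchor = torch.randn(batch, dim)
    positive = torch.randn(batch, dim)
    negative = torch.randn(batch, dim)   # stand-in for the self-generated prediction
    logits = torch.randn(batch, seq, vocab)
    print(contrastive_loss(anchor, positive, negative))
    print(token_uncertainty(logits))

In such a setup, sentences with a high embedding norm (a data-difficulty proxy) and batches on which the model shows high entropy would be scheduled later or weighted differently during fine-tuning; the paper itself should be consulted for the exact criteria.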
Pages: 7
Related Papers (50 in total)
  • [31] Zhou, Dengji; Huang, Dawen; Zhang, Xing; Tie, Ming; Wang, Yulin; Shen, Yaoxin. Flight parameter prediction for high-dynamic hypersonic vehicle system based on pre-training machine learning model. PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART G-JOURNAL OF AEROSPACE ENGINEERING, 2024, 238(11): 1041-1054.
  • [32] Tang, Xin; Liu, Kunjia; Xu, Hao; Xiao, Weidong; Tan, Zhen. Continual Pre-Training of Language Models for Concept Prerequisite Learning with Graph Neural Networks. MATHEMATICS, 2023, 11(12).
  • [33] Huang, Xin; Zhang, Jiajun; Zong, Chengqing. Contrastive Adversarial Training for Multi-Modal Machine Translation. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22(06).
  • [34] Wan, Bohua; McNutt, Todd; Ger, Rachel; Quon, Harry; Lee, Junghoon. Deep learning prediction of radiation-induced xerostomia with supervised contrastive pre-training and cluster-guided loss. COMPUTER-AIDED DIAGNOSIS, MEDICAL IMAGING 2024, 2024, 12927.
  • [35] Huynh, Andy V.; Gillespie, Lauren E.; Lopez-Saucedo, Jael; Tang, Claire; Sikand, Rohan; Exposito-Alonso, Moises. Contrastive Ground-Level Image and Remote Sensing Pre-training Improves Representation Learning for Natural World Imagery. COMPUTER VISION - ECCV 2024, PT LXXX, 2025, 15138: 173-190.
  • [36] Ge, Sihui; Yang, Zhi; Gan, Haitao; Huang, Zhongwei; Zhou, Ran; Wang, Ji. Contrastive pre-training of Soft-Clustering GCN for diagnosing Alzheimer's disease. 2024 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN 2024, 2024.
  • [37] Wang, Chen; Liang, Yueqing; Liu, Zhiwei; Zhang, Tao; Yu, Philip S. Pre-training Graph Neural Network for Cross Domain Recommendation. 2021 IEEE THIRD INTERNATIONAL CONFERENCE ON COGNITIVE MACHINE INTELLIGENCE (COGMI 2021), 2021: 140-145.
  • [38] Tang, Gongbo; Yousuf, Oreen; Jin, Zeying. Improving BERTScore for Machine Translation Evaluation Through Contrastive Learning. IEEE ACCESS, 2024, 12: 77739-77749.
  • [39] Qi, Qiaosong; Zhang, Aixi; Liao, Yue; Sun, Wenyu; Wang, Yongliang; Li, Xiaobo; Liu, Si. Simultaneously Training and Compressing Vision-and-Language Pre-Training Model. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 8194-8203.
  • [40] Yin, Yongjing; Zeng, Jiali; Su, Jinsong; Zhou, Chulun; Meng, Fandong; Zhou, Jie; Huang, Degen; Luo, Jiebo. Multi-modal graph contrastive encoding for neural machine translation. ARTIFICIAL INTELLIGENCE, 2023, 323.