Continual Learning with Confidence-based Multi-teacher Knowledge Distillation for Neural Machine Translation

Cited by: 0
Authors
Guo, Jiahua [1 ]
Liang, Yunlong [1 ]
Xu, Jinan [1 ]
Affiliations
[1] Beijing Jiaotong University, Beijing, People's Republic of China
Source
2024 6th International Conference on Natural Language Processing (ICNLP 2024), 2024
Funding
National Key Research and Development Program of China
Keywords
neural machine translation; continual learning; knowledge distillation;
DOI
10.1109/ICNLP60986.2024.10692378
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Continual learning is widely used in practical applications of neural machine translation; it aims not only to achieve good performance on new domains but also to preserve the knowledge of previously learned domains. However, existing continual learning methods usually suffer from catastrophic forgetting in the multi-domain continual learning scenario: when the model is trained on multiple diverse domains one after another, performance on the earlier domains drops drastically. In this work, we propose a multi-teacher knowledge distillation technique to systematically alleviate catastrophic forgetting. First, we adopt multi-teacher knowledge distillation, in which models from previous training stages serve as teachers. Second, to further improve performance on previous domains, we propose a confidence-based integration mechanism in which the teachers are combined with sample-adaptive weights based on their performance. We conduct experiments in a multi-domain continual learning setting in which a pre-trained model is sequentially transferred to five diverse domains (IT, Law, Medical, Subtitles, Koran). Experimental results show that the proposed method outperforms several strong baseline methods.
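The sketch below illustrates how a confidence-based combination of multiple teacher models could be implemented for token-level distillation in PyTorch. The function name multi_teacher_kd_loss, the choice of reference-token log-likelihood as the per-sample confidence measure, and the softmax weighting are illustrative assumptions for exposition, not the paper's exact formulation.

```python
# Minimal sketch of confidence-weighted multi-teacher knowledge distillation
# for NMT, assuming PyTorch and standard (batch, seq, vocab) logits.
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, target_ids,
                          pad_id=0, temperature=1.0):
    """student_logits: (B, S, V); teacher_logits_list: list of (B, S, V); target_ids: (B, S)."""
    mask = (target_ids != pad_id).float()                                        # (B, S)

    # Per-sample confidence of each teacher: average log-probability of the reference tokens.
    confidences = []
    for t_logits in teacher_logits_list:
        log_probs = F.log_softmax(t_logits, dim=-1)
        tok_ll = log_probs.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)      # (B, S)
        confidences.append((tok_ll * mask).sum(1) / mask.sum(1).clamp(min=1))    # (B,)
    weights = torch.softmax(torch.stack(confidences, dim=1), dim=1)              # (B, n_teachers)

    # Sample-adaptive mixture of the teachers' output distributions.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list], dim=1
    )                                                                            # (B, n_teachers, S, V)
    mixed = (weights[:, :, None, None] * teacher_probs).sum(1)                   # (B, S, V)

    # Cross-entropy between the mixed teacher distribution and the student.
    student_logp = F.log_softmax(student_logits / temperature, dim=-1)
    kd = -(mixed * student_logp).sum(-1)                                         # (B, S)
    return (kd * mask).sum() / mask.sum().clamp(min=1)
```

In this sketch, a teacher that assigns higher likelihood to the reference translation of a given sample receives a larger weight for that sample, so the distillation signal for each example leans toward the teacher whose domain knowledge fits it best.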
Pages: 336-343
Number of pages: 8