Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast

Cited by: 0
Authors
Guo, Yiduo [1 ]
Liang, Yaobo [2 ]
Zhao, Dongyan [1 ,4 ,5 ]
Liu, Bing [3 ]
Duan, Nan [2]
Affiliations
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] Univ Illinois, Dept Comp Sci, Chicago, IL USA
[4] Natl Key Lab Gen Artificial Intelligence, Beijing, Peoples R China
[5] BIGAI, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1 | 2023
Keywords
(none listed)
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Existing research has shown that a multilingual pre-trained language model fine-tuned on one (source) language also performs well on downstream tasks in non-source languages, even though no fine-tuning is done on those languages. However, there is a clear gap between the performance on the source language and that on the non-source languages. This paper analyzes the fine-tuning process, discovering when the performance gap changes and identifying which network weights affect overall performance the most. It also asks to what extent the gap can be reduced by reducing forgetting. Based on the analysis, a method named Fine-tuning Slow and Fast, with four training policies, is proposed to address these issues. Experimental results show that the proposed method outperforms baselines by a clear margin.
Pages: 4002-4017
Number of pages: 16