Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation

Citations: 0
Authors
Skurzhanskyi, O. H. [1 ]
Marchenko, O. O. [1 ]
Anisimov, A. V. [1 ]
Affiliations
[1] Taras Shevchenko Natl Univ Kyiv, Kiev, Ukraine
Keywords
artificial intelligence; machine learning; neural networks; paraphrase generation; pre-training; fine-tuning
DOI
10.1007/s10559-024-00658-7
CLC number
TP3 [computing technology, computer technology]
Subject classification code
0812
Abstract
Paraphrase generation is a fundamental problem in natural language processing. Owing to the considerable success of transfer learning, the "pre-training -> fine-tuning" approach has become the standard. However, popular general pre-training methods typically require extensive datasets and substantial computational resources, and the available pre-trained models are constrained to a fixed architecture and size. The authors propose a simple and efficient pre-training approach tailored specifically to paraphrase generation, which noticeably improves the quality of generated paraphrases and substantially improves general-purpose models. They use existing public data together with new data generated by large language models. The authors investigate how this pre-training procedure affects neural networks of various architectures and demonstrate its efficiency across all of them.
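The two-stage schedule the abstract describes can be sketched in miniature. The "model" below is a toy word-substitution table rather than a neural network, and the dataset contents and function names are illustrative assumptions, not the authors' method: stage 1 trains on synthetic (e.g., LLM-generated) paraphrase pairs, stage 2 fine-tunes on a smaller task-specific set whose updates take precedence.

```python
# Toy sketch of the "pre-training -> fine-tuning" schedule for paraphrase
# generation. NOT the paper's model: the "model" is a word-substitution
# table, and all data below is invented for illustration.

def train(model, pairs):
    """Update the substitution table from (source, paraphrase) pairs
    whose sentences happen to be word-aligned one-to-one."""
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            if s != t:
                model[s] = t  # later pairs (fine-tuning) overwrite earlier ones
    return model

def paraphrase(model, sentence):
    """Apply the learned substitutions word by word."""
    return " ".join(model.get(w, w) for w in sentence.split())

# Stage 1: pre-train on synthetic paraphrase pairs (hypothetical examples).
synthetic_pairs = [
    ("the movie was great", "the film was great"),
    ("she bought a car", "she purchased a car"),
]
model = train({}, synthetic_pairs)

# Stage 2: fine-tune on a smaller task-specific set; its updates win.
task_pairs = [("the movie was long", "the picture was long")]
model = train(model, task_pairs)

print(paraphrase(model, "the movie was great"))  # -> "the picture was great"
```

The point of the sketch is only the data flow: the same training routine runs twice, first on plentiful synthetic data and then on scarce task data, with the second pass refining what the first pass learned.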
Pages: 167-174
Number of pages: 8