Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation

Cited by: 0
Authors
Skurzhanskyi, O. H. [1]
Marchenko, O. O. [1]
Anisimov, A. V. [1]
Affiliations
[1] Taras Shevchenko Natl Univ Kyiv, Kiev, Ukraine
Keywords
artificial intelligence; machine learning; neural networks; paraphrase generation; pre-training; fine-tuning
DOI
10.1007/s10559-024-00658-7
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Paraphrase generation is a fundamental problem in natural language processing. Owing to the success of transfer learning, the "pre-training -> fine-tuning" approach has become the standard. However, popular general-purpose pre-training methods typically require extensive datasets and substantial computational resources, and the available pre-trained models are limited to a fixed architecture and size. The authors propose a simple and efficient pre-training approach designed specifically for paraphrase generation, which noticeably improves the quality of generated paraphrases and also substantially improves general-purpose models. The approach uses existing public data together with new data generated by large language models. The authors investigate how this pre-training procedure affects neural networks of various architectures and demonstrate its efficiency across all of them.
Pages: 167-174
Number of pages: 8