Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation

Citations: 0
Authors
Skurzhanskyi, O. H. [1 ]
Marchenko, O. O. [1 ]
Anisimov, A. V. [1 ]
Affiliations
[1] Taras Shevchenko Natl Univ Kyiv, Kiev, Ukraine
Keywords
artificial intelligence; machine learning; neural networks; paraphrase generation; pre-training; fine-tuning
DOI
10.1007/s10559-024-00658-7
CLC number
TP3 [computing technology, computer technology]
Subject classification code
0812
Abstract
Paraphrase generation is a fundamental problem in natural language processing. Owing to the considerable success of transfer learning, the "pre-training -> fine-tuning" approach has become the standard. However, popular general pre-training methods typically require extensive datasets and substantial computational resources, and the available pre-trained models are constrained to a fixed architecture and size. The authors propose a simple and efficient pre-training approach tailored specifically to paraphrase generation, which noticeably improves the quality of generated paraphrases and substantially improves general-purpose models. They use existing public data together with new data generated by large language models. The authors investigate how this pre-training procedure affects neural networks of various architectures and demonstrate its efficiency across all of them.
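The two-stage schedule the abstract describes can be sketched in miniature. The "model" below is a toy word-substitution table rather than a neural network, and the dataset contents and function names are illustrative assumptions, not the authors' method: stage 1 trains on synthetic (e.g., LLM-generated) paraphrase pairs, stage 2 fine-tunes on a smaller task-specific set whose updates take precedence.

```python
# Toy sketch of the "pre-training -> fine-tuning" schedule for paraphrase
# generation. NOT the paper's model: the "model" is a word-substitution
# table, and all data below is invented for illustration.

def train(model, pairs):
    """Update the substitution table from (source, paraphrase) pairs
    whose sentences happen to be word-aligned one-to-one."""
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            if s != t:
                model[s] = t  # later pairs (fine-tuning) overwrite earlier ones
    return model

def paraphrase(model, sentence):
    """Apply the learned substitutions word by word."""
    return " ".join(model.get(w, w) for w in sentence.split())

# Stage 1: pre-train on synthetic paraphrase pairs (hypothetical examples).
synthetic_pairs = [
    ("the movie was great", "the film was great"),
    ("she bought a car", "she purchased a car"),
]
model = train({}, synthetic_pairs)

# Stage 2: fine-tune on a smaller task-specific set; its updates win.
task_pairs = [("the movie was long", "the picture was long")]
model = train(model, task_pairs)

print(paraphrase(model, "the movie was great"))  # -> "the picture was great"
```

The point of the sketch is only the data flow: the same training routine runs twice, first on plentiful synthetic data and then on scarce task data, with the second pass refining what the first pass learned.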
Pages: 167-174
Number of pages: 8