Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation

Cited by: 0
Authors
Skurzhanskyi, O. H. [1]
Marchenko, O. O. [1]
Anisimov, A. V. [1]
Affiliations
[1] Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Keywords
artificial intelligence; machine learning; neural networks; paraphrase generation; pre-training; fine-tuning
DOI
10.1007/s10559-024-00658-7
CLC Number
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Paraphrase generation is a fundamental problem in natural language processing. Owing to the remarkable success of transfer learning, the "pre-training -> fine-tuning" pipeline has become the standard approach. However, popular general-purpose pre-training methods typically require massive datasets and substantial computational resources, and the available pre-trained models are constrained to fixed architectures and sizes. The authors propose a simple and efficient pre-training approach tailored specifically to paraphrase generation, which noticeably improves the quality of generated paraphrases and substantially enhances general-purpose models. The pre-training corpus combines existing public data with new data generated by large language models. The authors investigate how this pre-training procedure affects neural networks of various architectures and demonstrate its effectiveness across all of them.
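The two-stage recipe described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration rather than the authors' actual code: it assumes a Hugging Face seq2seq model (t5-small is an arbitrary stand-in), toy synthetic and human-written paraphrase pairs, and illustrative learning rates. Stage 1 pre-trains on synthetic paraphrase pairs; stage 2 fine-tunes on the target corpus.

```python
# Sketch of the "specialized pre-training -> fine-tuning" recipe for paraphrase
# generation. Model choice, toy data, and hyperparameters are illustrative
# assumptions, not the setup reported in the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Stage 1 data: synthetic (source, paraphrase) pairs, e.g. produced by an LLM.
synthetic_pairs = [
    ("the cat sat on the mat", "a cat was sitting on the mat"),
    ("he quickly finished the report", "he completed the report in no time"),
]
# Stage 2 data: a (typically smaller) human-curated paraphrase corpus.
human_pairs = [
    ("how do i reset my password", "what is the way to reset my password"),
]

def train_on(pairs, lr, epochs=1):
    """Teacher-forced seq2seq training on (source, paraphrase) pairs."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for src, tgt in pairs:
            batch = tokenizer(src, return_tensors="pt")
            labels = tokenizer(tgt, return_tensors="pt").input_ids
            loss = model(**batch, labels=labels).loss  # cross-entropy vs. the paraphrase
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

# Stage 1: specialized pre-training on synthetic paraphrase pairs.
train_on(synthetic_pairs, lr=3e-4)
# Stage 2: fine-tuning on the target (human-written) paraphrase data.
train_on(human_pairs, lr=1e-4)

# Inference: generate a paraphrase with the adapted model.
model.eval()
inputs = tokenizer("the weather is nice today", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

The point of the design, per the abstract, is that stage 1 is task-specific rather than general-purpose pre-training, so it stays cheap and architecture-agnostic: the same two-stage loop applies unchanged to any seq2seq network.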
Pages: 167-174 (8 pages)
Related Papers (50 items)
  • [1] Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation
    O. H. Skurzhanskyi
    O. O. Marchenko
    A. V. Anisimov
    Cybernetics and Systems Analysis, 2024, 60: 167-174
  • [2] Generative Pre-training for Paraphrase Generation by Representing and Predicting Spans in Exemplars
    Bui, Tien-Cuong
    Le, Van-Duc
    To, Hai-Thien
    Cha, Sang Kyun
    2021 IEEE International Conference on Big Data and Smart Computing (BigComp 2021), 2021: 83-90
  • [3] Synthetic pre-training for neural-network interatomic potentials
    Gardner, John L. A.
    Baker, Kathryn T.
    Deringer, Volker L.
    Machine Learning: Science and Technology, 2024, 5(1)
  • [4] Pre-training on dynamic graph neural networks
    Chen, Ke-Jia
    Zhang, Jiajun
    Jiang, Linpu
    Wang, Yunyun
    Dai, Yuxuan
    Neurocomputing, 2022, 500: 679-687
  • [5] Synthetic Training Data Generation for Convolutional Neural Networks in Vision Applications
    Vietz, Hannes
    Rauch, Tristan
    Weyrich, Michael
    2022 IEEE 27th International Conference on Emerging Technologies and Factory Automation (ETFA), 2022
  • [6] PHGNN: Pre-Training Heterogeneous Graph Neural Networks
    Li, Xin
    Wei, Hao
    Ding, Yu
    IEEE Access, 2024, 12: 135411-135418
  • [7] Neural Networks for Sequential Data: A Pre-training Approach Based on Hidden Markov Models
    Pasa, Luca
    Testolin, Alberto
    Sperduti, Alessandro
    Neurocomputing, 2015, 169: 323-333
  • [8] Dynamic Pre-training of Deep Recurrent Neural Networks for Predicting Environmental Monitoring Data
    Ong, Bun Theang
    Sugiura, Komei
    Zettsu, Koji
    2014 IEEE International Conference on Big Data (Big Data), 2014: 760-765
  • [9] PSP: Pre-training and Structure Prompt Tuning for Graph Neural Networks
    Ge, Qingqing
    Zhao, Zeyuan
    Liu, Yiding
    Cheng, Anfeng
    Li, Xiang
    Wang, Shuaiqiang
    Yin, Dawei
    Machine Learning and Knowledge Discovery in Databases: Research Track, Part V (ECML PKDD 2024), 2024, 14945: 423-439
  • [10] Generation of Synthetic Structural Magnetic Resonance Images for Deep Learning Pre-training
    Castro, Eduardo
    Ulloa, Alvaro
    Plis, Sergey M.
    Turner, Jessica A.
    Calhoun, Vince D.
    2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), 2015: 1057-1060