Representation transfer and data cleaning in multi-views for text simplification

Cited by: 1
Authors
He, Wei [1, 2]
Farrahi, Katayoun [1]
Chen, Bin [3]
Peng, Bohua [2]
Villavicencio, Aline [2]
Affiliations
[1] Univ Southampton, Dept Elect & Comp Sci, Southampton SO17 1BJ, England
[2] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, England
[3] Univ Sheffield, Dept Automatic Control & Syst Engn, Sheffield S1 3JD, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Text simplification; Sentence representation; Pre-trained language model; Data cleaning; Decoding;
DOI
10.1016/j.patrec.2023.11.011
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Representation transfer is a widely used technique in natural language processing. We propose methods for cleaning WikiLarge, the dominant dataset for text simplification (TS), from multiple views to remove errors that affect model training and fine-tuning. The results show that our method effectively refines the dataset. We also propose transferring pre-trained text representations from a similar task (e.g., text summarization) to text simplification through a continued fine-tuning strategy, which improves the performance of pre-trained models on TS, speeds up training, and makes model convergence easier. In addition, we propose a new decoding strategy for simple text generation that produces simpler and more comprehensible text with controllable lexical simplicity. Experimental results show that our method achieves good performance on many evaluation metrics.
Pages: 40-46
Number of pages: 7
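Illustrative sketch
The abstract mentions cleaning a parallel simplification dataset from several views but does not say which checks are used. The Python sketch below is only a guess at what such a filter might look like: the three checks (identical pair, length ratio, token overlap), the function name `clean_pairs`, and all thresholds are assumptions made for illustration, not the authors' method.

```python
def clean_pairs(pairs, min_ratio=0.2, max_ratio=1.5, min_overlap=0.1):
    """Keep (complex, simple) sentence pairs that pass every check.

    All checks and thresholds are hypothetical; the paper's actual
    cleaning criteria are not given in the abstract.
    """
    kept = []
    for complex_sent, simple_sent in pairs:
        c_tokens = complex_sent.lower().split()
        s_tokens = simple_sent.lower().split()

        # Drop pairs with an empty side.
        if not c_tokens or not s_tokens:
            continue

        # View 1: the target is an unchanged copy of the source.
        if c_tokens == s_tokens:
            continue

        # View 2: the target is implausibly short or long for the source.
        ratio = len(s_tokens) / len(c_tokens)
        if not (min_ratio <= ratio <= max_ratio):
            continue

        # View 3: Jaccard token overlap is too low, suggesting misalignment.
        c_set, s_set = set(c_tokens), set(s_tokens)
        overlap = len(c_set & s_set) / len(c_set | s_set)
        if overlap < min_overlap:
            continue

        kept.append((complex_sent, simple_sent))
    return kept
```

Whitespace tokenization keeps the sketch dependency-free; a real pipeline would likely use the tokenizer of the model being fine-tuned.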