ParaSum: Contrastive Paraphrasing for Low-Resource Extractive Text Summarization

Cited by: 0
Authors
Tang, Moming [1 ]
Wang, Chengyu [2 ]
Wang, Jianing [1 ]
Chen, Cen [1 ]
Gao, Ming [1 ]
Qian, Weining [1 ]
Affiliations
[1] East China Normal Univ, Sch Data Sci & Engn, Shanghai, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
Source
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT III, KSEM 2023 | 2023, Vol. 14119
Funding
National Natural Science Foundation of China;
Keywords
low-resource scenarios; extractive summarization; textual paraphrasing; transfer learning; pre-trained language model;
DOI
10.1007/978-3-031-40289-0_9
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing extractive summarization methods achieve state-of-the-art (SOTA) performance with pre-trained language models (PLMs) and sufficient training data. However, PLM-based methods are known to be data-hungry and often fail to deliver satisfactory results in low-resource scenarios, while constructing a high-quality summarization dataset with human-authored reference summaries is prohibitively expensive. To address these challenges, this paper proposes a novel paradigm for low-resource extractive summarization, called ParaSum. This paradigm reformulates text summarization as textual paraphrasing, aligning the summarization task with the self-supervised Next Sentence Prediction (NSP) task of PLMs. This alignment minimizes the training gap between the summarization model and the PLMs, enabling more effective probing of the knowledge encoded within PLMs and enhancing summarization performance. Furthermore, to relax the requirement for large amounts of training data, we introduce a simple yet efficient model and align the training paradigm of summarization with that of textual paraphrasing to facilitate network-based transfer learning. Extensive experiments on two widely used benchmarks (i.e., CNN/DailyMail, XSum) and a recently open-sourced, high-quality Chinese benchmark (i.e., CNewSum) show that ParaSum consistently outperforms existing PLM-based summarization methods in all low-resource settings, demonstrating its effectiveness across different types of datasets.
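To illustrate the general idea described above, the following is a minimal sketch of how a PLM's Next Sentence Prediction head can be probed to score candidate sentences for extractive selection. This is not ParaSum's actual implementation: the model choice (bert-base-uncased), the naive sentence splitting, the toy document, and the use of the raw "IsNext" probability as a relevance score are all illustrative assumptions.

```python
# Hedged sketch: probing a PLM's NSP head to rank sentences for extraction.
# Illustrative only; not the authors' ParaSum model.
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
model.eval()

# Toy document (assumption, for illustration only).
document = (
    "The city council approved the new transit plan on Monday. "
    "Construction of the light-rail line is expected to begin next spring. "
    "Officials said the project will be funded by a regional sales tax."
)
candidates = document.split(". ")  # naive sentence split for illustration

scores = []
with torch.no_grad():
    for sent in candidates:
        # Treat (document, candidate sentence) as an NSP-style pair and use the
        # probability of the "IsNext" class (index 0) as a relevance score.
        inputs = tokenizer(document, sent, return_tensors="pt",
                           truncation=True, max_length=512)
        logits = model(**inputs).logits            # shape: (1, 2)
        prob_coherent = torch.softmax(logits, dim=-1)[0, 0].item()
        scores.append((prob_coherent, sent))

# Select the top-scoring sentences as a crude extractive summary.
for prob, sent in sorted(scores, reverse=True)[:2]:
    print(f"{prob:.3f}  {sent}")
```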
Pages: 106-119
Number of pages: 14