Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning

Cited by: 26
Authors
Lamsiyah, Salima [1 ]
El Mahdaouy, Abdelkader [3 ]
Ouatik, Said El Alaoui [1 ,2 ]
Espinasse, Bernard [4 ]
Affiliations
[1] Sidi Mohamed Ben Abdellah Univ, FSDM, Lab Informat Signals Automat & Cognitivism, BP 1796, Fez Atlas 30003, Morocco
[2] Ibn Tofail Univ, Natl Sch Appl Sci, Lab Engn Sci, Kenitra, Morocco
[3] Mohammed VI Polytech Univ UM6P, Sch Comp Sci UM6P CS, Ben Guerir, Morocco
[4] Univ Toulon & Var, Aix Marseille Univ, CNRS, LIS, UMR 7020, Toulon, France
Keywords
BERT fine-tuning; multi-document summarization; multi-task learning; sentence representation learning; transfer learning; sentence scoring techniques
DOI
10.1177/0165551521990616
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Text representation is a fundamental cornerstone that impacts the effectiveness of several text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider the order of words in a sentence or the semantic relationships between them, and thus they do not carry the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from a BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark datasets using single-task and multi-task fine-tuning methods. Experiments are performed on the standard DUC'2002-2004 datasets. The results show that our method significantly outperforms several baseline methods and achieves comparable, and sometimes better, performance than recent state-of-the-art deep learning-based methods. Furthermore, the results show that fine-tuning BERT with multi-task learning considerably improves performance.
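To make the pipeline described in the abstract concrete, the following is a minimal sketch (not the authors' released code) of unsupervised extractive scoring with BERT sentence embeddings: sentences are embedded by mean pooling over BERT token states and ranked by cosine similarity to the document-cluster centroid. The model checkpoint, the pooling strategy, and the centroid-based scoring are illustrative assumptions; in the paper, BERT is first fine-tuned on GLUE intermediate tasks (single-task or multi-task) before its sentence embeddings are used.

```python
# Minimal illustrative sketch (not the authors' implementation): score and select
# sentences with BERT sentence embeddings. "bert-base-uncased", mean pooling, and
# centroid-similarity ranking are assumptions made for this example.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-uncased"  # in the paper, a GLUE fine-tuned BERT would replace this
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed(sentences):
    """Sentence vectors via mean pooling over the last hidden states."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state       # (batch, seq_len, hidden)
    mask = enc["attention_mask"].unsqueeze(-1)        # mask out padding tokens
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

def extractive_summary(sentences, k=3):
    """Rank sentences by cosine similarity to the document-cluster centroid."""
    vecs = F.normalize(embed(sentences), dim=-1)
    centroid = F.normalize(vecs.mean(dim=0, keepdim=True), dim=-1)
    scores = (vecs @ centroid.T).squeeze(-1)          # cosine similarity per sentence
    top = scores.topk(min(k, len(sentences))).indices.sort().values
    return [sentences[i] for i in top.tolist()]       # keep original sentence order

if __name__ == "__main__":
    cluster = [
        "BERT produces contextual representations of text.",
        "Extractive summarization selects the most salient sentences.",
        "The weather in Fez was pleasant that day.",
        "Multi-task fine-tuning can improve sentence embeddings.",
    ]
    print(extractive_summary(cluster, k=2))
```

Swapping MODEL_NAME for a checkpoint fine-tuned on GLUE intermediate tasks is the part of the pipeline the paper evaluates; the centroid scoring here is only one plausible unsupervised selection strategy.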
Pages: 164-182
Number of pages: 19