Exploring Abstractive Text Summarization: Methods, Dataset, Evaluation, and Emerging Challenges

Cited by: 0
Authors
Sunusi, Yusuf [1 ]
Omar, Nazlia [1 ]
Zakaria, Lailatul Qadri [1 ]
Affiliations
[1] Universiti Kebangsaan Malaysia, Center for Artificial Intelligence Technology, Bangi 43600, Malaysia
Keywords
Abstractive text summarization; systematic literature review; natural language processing; evaluation metrics; dataset; computational linguistics; MODEL; RNN
DOI
10.14569/IJACSA.2024.01507130
CLC Number
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
The latest advanced models for abstractive summarization, which utilize encoder-decoder frameworks, produce exactly one summary for each source text. This systematic literature review (SLR) comprehensively examines the recent advancements in abstractive text summarization (ATS), a pivotal area in natural language processing (NLP) that aims to generate concise and coherent summaries from extensive text sources. We delve into the evolution of ATS, focusing on key aspects such as encoder-decoder architectures, innovative mechanisms like attention and pointer-generator models, training and optimization methods, datasets, and evaluation metrics. Our review analyzes a wide range of studies, highlighting the transition from traditional sequence-to-sequence models to more advanced approaches like Transformer-based architectures. We explore the integration of mechanisms such as attention, which enhances model interpretability and effectiveness, and pointer-generator networks, which adeptly balance between copying and generating text. The review also addresses the challenges in training these models, including issues related to dataset quality and diversity, particularly in low-resource languages. A critical analysis of evaluation metrics reveals a heavy reliance on ROUGE scores, prompting a discussion on the need for more nuanced evaluation methods that align closely with human judgment. Additionally, we identify and discuss emerging research gaps, such as the need for effective summary length control and the handling of model hallucination, which are crucial for the practical application of ATS. This SLR not only synthesizes current research trends and methodologies in ATS, but also provides insights into future directions, underscoring the importance of continuous innovation in model development, dataset enhancement, and evaluation strategies. Our findings aim to guide researchers and practitioners in navigating the evolving landscape of abstractive text summarization and in identifying areas ripe for future exploration and development.
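As context for the copy-versus-generate balance mentioned in the abstract, the sketch below illustrates how a pointer-generator network typically forms its final word distribution: a soft switch p_gen interpolates between a softmax over the fixed vocabulary and the encoder attention weights scattered onto source-token ids, which is what lets the model copy out-of-vocabulary words from the source. Variable names and shapes here are illustrative assumptions, not the implementation of any particular model surveyed in the paper.

```python
# Minimal sketch of a pointer-generator output distribution (See et al. 2017 style).
# All names (final_distribution, source_ids, extended_vocab_size) are hypothetical.
import numpy as np

def final_distribution(p_gen, vocab_dist, attention, source_ids, extended_vocab_size):
    """Mix the generation and copy distributions.

    p_gen               -- scalar in [0, 1], probability of generating from the vocabulary
    vocab_dist          -- (V,) softmax over the fixed vocabulary
    attention           -- (T,) attention weights over the T source tokens
    source_ids          -- (T,) ids of the source tokens in the extended vocabulary
    extended_vocab_size -- fixed vocabulary plus source-specific OOV tokens
    """
    dist = np.zeros(extended_vocab_size)
    dist[: len(vocab_dist)] = p_gen * vocab_dist               # generate from vocabulary
    np.add.at(dist, source_ids, (1.0 - p_gen) * attention)     # copy from the source
    return dist

# Toy example: 5-word vocabulary, 3 source tokens, one of them an OOV word at id 5.
vocab_dist = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
attention  = np.array([0.7, 0.2, 0.1])
source_ids = np.array([2, 5, 0])
print(final_distribution(0.8, vocab_dist, attention, source_ids, extended_vocab_size=6))
```

In the toy example, token id 5 exists only in the source text; it still receives probability mass through the copy term even though the fixed vocabulary has just five entries, and the mixture remains a valid distribution (it sums to one).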
Pages: 1340-1357
Number of pages: 18
References (84 in total)
  • [51] Nie F. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 2019, p. 2673.
  • [52] Palliyali A. W., Al-Khalifa M. A., Farooq S., Abinahed J., Al-Ansari A., Jaoua A. Comparative Study of Extractive Text Summarization Techniques. 2021 IEEE/ACS 18th International Conference on Computer Systems and Applications (AICCSA), 2021.
  • [53] Papineni K., Roukos S., Ward T., Zhu W. J. BLEU: a method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002, pp. 311-318.
  • [54] Preethi S. 2022 International Conference on Edge Computing and Applications (ICECAA), 2022, p. 1605. DOI: 10.1109/ICECAA55415.2022.9936215.
  • [55] Raffel C. Journal of Machine Learning Research, 2020, Vol. 21.
  • [56] Rahul, Rauniyar S., Monika. A Survey on Deep Learning based Various Methods Analysis of Text Summarization. Proceedings of the 5th International Conference on Inventive Computation Technologies (ICICT-2020), 2020, pp. 113-116.
  • [57] Ravaut M. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022, p. 4504.
  • [58] Rehman T., Sanyal D. K., Chattopadhyay S., Bhowmick P. K., Das P. P. Generation of Highlights From Research Papers Using Pointer-Generator Networks and SciBERT Embeddings. IEEE Access, 2023, 11: 91358-91374.
  • [59] Ren S. Pointer-Generator Abstractive Text Summarization Model with Part of Speech Features. 2019. DOI: 10.1109/ICSESS47205.2019.9040715.
  • [60] Rush A. M. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, p. 379. DOI: 10.18653/v1/D15-1044.