Exploring Abstractive Text Summarization: Methods, Dataset, Evaluation, and Emerging Challenges

Cited by: 0
Authors
Sunusi, Yusuf [1 ]
Omar, Nazlia [1 ]
Zakaria, Lailatul Qadri [1 ]
Affiliations
[1] Univ Kebangsaan Malaysia, Ctr Artificial Intelligence Technol, Bangi 43600, Malaysia
Keywords
Abstractive text summarization; systematic literature review; natural language processing; evaluation metrics; dataset; computational linguistics; MODEL; RNN
DOI
10.14569/IJACSA.2024.01507130
Chinese Library Classification
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
The latest advanced models for abstractive summarization, which utilize encoder-decoder frameworks, produce exactly one summary for each source text. This systematic literature review (SLR) comprehensively examines the recent advancements in abstractive text summarization (ATS), a pivotal area in natural language processing (NLP) that aims to generate concise and coherent summaries from extensive text sources. We delve into the evolution of ATS, focusing on key aspects such as encoder-decoder architectures, innovative mechanisms like attention and pointer-generator models, training and optimization methods, datasets, and evaluation metrics. Our review analyzes a wide range of studies, highlighting the transition from traditional sequence-to-sequence models to more advanced approaches like Transformer-based architectures. We explore the integration of mechanisms such as attention, which enhances model interpretability and effectiveness, and pointer-generator networks, which adeptly balance between copying and generating text. The review also addresses the challenges in training these models, including issues related to dataset quality and diversity, particularly in low-resource languages. A critical analysis of evaluation metrics reveals a heavy reliance on ROUGE scores, prompting a discussion on the need for more nuanced evaluation methods that align closely with human judgment. Additionally, we identify and discuss emerging research gaps, such as the need for effective summary length control and the handling of model hallucination, which are crucial for the practical application of ATS. This SLR not only synthesizes current research trends and methodologies in ATS, but also provides insights into future directions, underscoring the importance of continuous innovation in model development, dataset enhancement, and evaluation strategies. Our findings aim to guide researchers and practitioners in navigating the evolving landscape of abstractive text summarization and in identifying areas ripe for future exploration and development.
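The abstract's remark that pointer-generator networks "balance between copying and generating text" follows a concrete recipe: the decoder computes a generation probability p_gen and mixes the vocabulary softmax with an attention-weighted copy distribution, P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention weights on source positions holding w. Below is a minimal NumPy sketch of that mixing step only, not code from any surveyed system; the function name, shapes, and toy inputs are illustrative assumptions.

import numpy as np

def pointer_generator_mix(p_vocab, attention, source_ids, p_gen):
    """Mix a generation distribution with an attention-based copy distribution.

    p_vocab    : (vocab_size,) softmax over the fixed output vocabulary
    attention  : (src_len,)    attention weights over source positions (sums to 1)
    source_ids : (src_len,)    vocabulary id of each source token
    p_gen      : float in [0, 1], learned probability of generating vs. copying
    """
    copy_dist = np.zeros_like(p_vocab)
    # Scatter-add attention mass onto each source token's vocabulary id,
    # so a word appearing twice in the source accumulates both weights.
    np.add.at(copy_dist, source_ids, attention)
    return p_gen * p_vocab + (1.0 - p_gen) * copy_dist

# Toy example (all values invented): 6-word vocabulary, 4-token source.
rng = np.random.default_rng(0)
p_vocab = rng.dirichlet(np.ones(6))       # decoder's generation softmax
attention = rng.dirichlet(np.ones(4))     # attention over the source tokens
source_ids = np.array([2, 5, 2, 1])       # ids of the source tokens (id 2 repeats)
final = pointer_generator_mix(p_vocab, attention, source_ids, p_gen=0.7)
print(final, final.sum())                 # a valid distribution: sums to 1.0

Because attention mass is scattered onto source-token ids, a rare source word can still receive probability even when the generation softmax assigns it almost none, which is the property that lets these models copy faithfully while still generating fluently.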
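On the evaluation side, the heavy reliance on ROUGE that the review critiques typically amounts to a few lines of scoring code in practice. The sketch below uses Google's open-source rouge-score package; the reference and candidate strings are invented for illustration.

# pip install rouge-score
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "the cat sat quietly on the warm mat"   # human-written summary (invented)
candidate = "a cat was sitting on the mat"          # system output (invented)

# score(target, prediction) returns a dict of Score(precision, recall, fmeasure).
scores = scorer.score(reference, candidate)
for name, s in scores.items():
    print(f"{name}: P={s.precision:.2f} R={s.recall:.2f} F1={s.fmeasure:.2f}")

Because ROUGE rewards n-gram overlap with the reference, a fluent and faithful summary phrased differently can still score poorly; this is the gap with human judgment that the review highlights when calling for more nuanced evaluation methods.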
Pages: 1340-1357
Page count: 18