Assessing the Impact of Prompt Strategies on Text Summarization with Large Language Models

Times Cited: 0
Authors
Onan, Aytug [1 ]
Alhumyani, Hesham [2 ]
Affiliations
[1] Izmir Katip Celebi Univ, Fac Engn & Architecture, Dept Comp Engn, TR-35620 Izmir, Turkiye
[2] Taif Univ, Coll Comp & Informat Technol, Dept Comp Engn, POB 11099, Taif 21944, Saudi Arabia
Source
COMPUTER APPLICATIONS IN INDUSTRY AND ENGINEERING, CAINE 2024 | 2025, Vol. 2242
Keywords
Large Language Models; Text Summarization; Prompt Strategies; Zero-shot Learning; One-shot Learning; Few-shot Learning; ROUGE; BLEU; BERTScore
DOI
10.1007/978-3-031-76273-4_4
CLC Number
TP39 [Computer Applications]
Subject Classification
081203; 0835
Abstract
The advent of large language models (LLMs) has significantly advanced the field of text summarization, enabling the generation of coherent and contextually accurate summaries. This paper introduces a comprehensive framework for evaluating the performance of state-of-the-art LLMs in text summarization, with a particular focus on the impact of various prompt strategies, including zero-shot, one-shot, and few-shot learning. Our framework systematically examines how these prompting techniques influence summarization quality across diverse datasets, namely CNN/Daily Mail, XSum, TAC08, and TAC09. To provide a robust evaluation, we employ a range of intrinsic metrics such as ROUGE, BLEU, and BERTScore. These metrics allow us to quantify the quality of the generated summaries in terms of precision, recall, and semantic similarity. We evaluate three prominent LLMs: GPT-3, GPT-4, and LLaMA, each configured to optimize summarization performance under the different prompting strategies. Our results reveal significant variations in performance depending on the chosen prompting strategy, highlighting the strengths and limitations of each approach. Furthermore, this study provides insights into the optimal conditions for employing different prompt strategies, offering practical guidelines for researchers and practitioners aiming to leverage LLMs for text summarization tasks. By delivering a thorough comparative analysis, we contribute to the understanding of how to maximize the potential of LLMs in generating high-quality summaries, ultimately advancing the field of natural language processing.
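
The prompt strategies compared in the abstract differ only in how many worked article/summary demonstrations precede the document to be summarized. The sketch below illustrates how such zero-shot, one-shot, and few-shot prompts can be assembled; the template wording, the build_prompt helper, and the example pairs are illustrative assumptions, not the authors' published prompts.

from typing import List, Tuple

def build_prompt(article: str, examples: List[Tuple[str, str]]) -> str:
    """Assemble a summarization prompt with len(examples) in-context demonstrations.

    examples == []      -> zero-shot
    len(examples) == 1  -> one-shot
    len(examples) >= 2  -> few-shot
    """
    parts = ["Summarize the following article in 2-3 sentences."]
    for demo_article, demo_summary in examples:
        parts.append(f"Article: {demo_article}\nSummary: {demo_summary}")
    parts.append(f"Article: {article}\nSummary:")
    return "\n\n".join(parts)

# Hypothetical demonstration pairs used only to show the three settings.
demos = [
    ("The city council approved a new cycling lane on Main Street ...",
     "The council approved a Main Street cycling lane."),
    ("Researchers reported a modest gain in battery capacity ...",
     "Researchers reported a small battery-capacity improvement."),
]

zero_shot = build_prompt("Article text to summarize ...", [])
one_shot = build_prompt("Article text to summarize ...", demos[:1])
few_shot = build_prompt("Article text to summarize ...", demos)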
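
For the intrinsic metrics named in the abstract (ROUGE, BLEU, BERTScore), the snippet below shows one common way to score a generated summary against a reference using the open-source rouge-score, nltk, and bert-score packages. The choice of these libraries and their parameters is an assumption for illustration; the paper does not state which implementations or settings it used.

# pip install rouge-score nltk bert-score
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from bert_score import score as bert_score

reference = "The council approved a Main Street cycling lane."
candidate = "A new cycling lane on Main Street was approved by the council."

# ROUGE-1/2/L F-measures (precision and recall are also on each Score tuple).
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
rouge = {k: v.fmeasure for k, v in scorer.score(reference, candidate).items()}

# Sentence-level BLEU with smoothing (summaries are short, so smoothing matters).
bleu = sentence_bleu(
    [reference.split()], candidate.split(),
    smoothing_function=SmoothingFunction().method1,
)

# BERTScore F1: semantic similarity from contextual embeddings.
_, _, f1 = bert_score([candidate], [reference], lang="en")

print(rouge, bleu, float(f1.mean()))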
Pages: 41-55 (15 pages)