Assessing the Impact of Prompt Strategies on Text Summarization with Large Language Models

Times Cited: 0
Authors
Onan, Aytug [1 ]
Alhumyani, Hesham [2 ]
Affiliations
[1] Izmir Katip Celebi Univ, Fac Engn & Architecture, Dept Comp Engn, TR-35620 Izmir, Turkiye
[2] Taif Univ, Coll Comp & Informat Technol, Dept Comp Engn, POB 11099, Taif 21944, Saudi Arabia
Source
COMPUTER APPLICATIONS IN INDUSTRY AND ENGINEERING, CAINE 2024 | 2025, Vol. 2242
Keywords
Large Language Models; Text Summarization; Prompt Strategies; Zero-shot Learning; One-shot Learning; Few-shot Learning; ROUGE; BLEU; BERTScore
DOI
10.1007/978-3-031-76273-4_4
CLC Number
TP39 [Computer Applications]
Subject Classification
081203; 0835
Abstract
The advent of large language models (LLMs) has significantly advanced the field of text summarization, enabling the generation of coherent and contextually accurate summaries. This paper introduces a comprehensive framework for evaluating the performance of state-of-the-art LLMs in text summarization, with a particular focus on the impact of various prompt strategies, including zero-shot, one-shot, and few-shot learning. Our framework systematically examines how these prompting techniques influence summarization quality across diverse datasets, namely CNN/Daily Mail, XSum, TAC08, and TAC09. To provide a robust evaluation, we employ a range of intrinsic metrics such as ROUGE, BLEU, and BERTScore. These metrics allow us to quantify the quality of the generated summaries in terms of precision, recall, and semantic similarity. We evaluate three prominent LLMs: GPT-3, GPT-4, and LLaMA, each configured to optimize summarization performance under the different prompting strategies. Our results reveal significant variations in performance depending on the chosen prompting strategy, highlighting the strengths and limitations of each approach. Furthermore, this study provides insights into the optimal conditions for employing different prompt strategies, offering practical guidelines for researchers and practitioners aiming to leverage LLMs for text summarization tasks. By delivering a thorough comparative analysis, we contribute to the understanding of how to maximize the potential of LLMs in generating high-quality summaries, ultimately advancing the field of natural language processing.
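
The prompt strategies compared in the abstract differ only in how many worked article/summary demonstrations precede the document to be summarized. The sketch below illustrates how such zero-shot, one-shot, and few-shot prompts can be assembled; the template wording, the build_prompt helper, and the example pairs are illustrative assumptions, not the authors' published prompts.

from typing import List, Tuple

def build_prompt(article: str, examples: List[Tuple[str, str]]) -> str:
    """Assemble a summarization prompt with len(examples) in-context demonstrations.

    examples == []      -> zero-shot
    len(examples) == 1  -> one-shot
    len(examples) >= 2  -> few-shot
    """
    parts = ["Summarize the following article in 2-3 sentences."]
    for demo_article, demo_summary in examples:
        parts.append(f"Article: {demo_article}\nSummary: {demo_summary}")
    parts.append(f"Article: {article}\nSummary:")
    return "\n\n".join(parts)

# Hypothetical demonstration pairs used only to show the three settings.
demos = [
    ("The city council approved a new cycling lane on Main Street ...",
     "The council approved a Main Street cycling lane."),
    ("Researchers reported a modest gain in battery capacity ...",
     "Researchers reported a small battery-capacity improvement."),
]

zero_shot = build_prompt("Article text to summarize ...", [])
one_shot = build_prompt("Article text to summarize ...", demos[:1])
few_shot = build_prompt("Article text to summarize ...", demos)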
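
For the intrinsic metrics named in the abstract (ROUGE, BLEU, BERTScore), the snippet below shows one common way to score a generated summary against a reference using the open-source rouge-score, nltk, and bert-score packages. The choice of these libraries and their parameters is an assumption for illustration; the paper does not state which implementations or settings it used.

# pip install rouge-score nltk bert-score
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from bert_score import score as bert_score

reference = "The council approved a Main Street cycling lane."
candidate = "A new cycling lane on Main Street was approved by the council."

# ROUGE-1/2/L F-measures (precision and recall are also on each Score tuple).
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
rouge = {k: v.fmeasure for k, v in scorer.score(reference, candidate).items()}

# Sentence-level BLEU with smoothing (summaries are short, so smoothing matters).
bleu = sentence_bleu(
    [reference.split()], candidate.split(),
    smoothing_function=SmoothingFunction().method1,
)

# BERTScore F1: semantic similarity from contextual embeddings.
_, _, f1 = bert_score([candidate], [reference], lang="en")

print(rouge, bleu, float(f1.mean()))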
Pages: 41-55 (15 pages)