Software System Testing Assisted by Large Language Models: An Exploratory Study

Times cited: 0
Authors
Augusto, Cristian [1]
Moran, Jesus [1]
Bertolino, Antonia [2]
de la Riva, Claudio [1]
Tuya, Javier [1]
Affiliations
[1] Univ Oviedo, Comp Sci Dept, Gijon, Spain
[2] ISTI CNR, Consiglio Nazl Ric, Pisa, Italy
Source
TESTING SOFTWARE AND SYSTEMS, ICTSS 2024 | 2025 / Vol. 15383
Keywords
Large Language Model; Software Testing; System Testing; Test Cases; Test Scenarios;
DOI
10.1007/978-3-031-80889-0_17
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Large language models (LLMs) based on transformer architecture have revolutionized natural language processing (NLP), demonstrating excellent capabilities in understanding and generating human-like text. In Software Engineering, LLMs have been applied in code generation, documentation, and report writing tasks, to support the developer and reduce the amount of manual work. In Software Testing, one of the cornerstones of Software Engineering, LLMs have been explored for generating test code, test inputs, automating the oracle process or generating test scenarios. However, their application to high-level testing stages such as system testing, in which a deep knowledge of the business and the technological stack is needed, remains largely unexplored. This paper presents an exploratory study about how LLMs can support system test development. Given that LLM performance depends on input data quality, the study focuses on how to query general purpose LLMs to first obtain test scenarios and then derive test cases from them. The study evaluates two popular LLMs (GPT-4o and GPT-4o-mini), using as a benchmark a European project demonstrator. The study compares two different prompt strategies and employs well-established prompt patterns, showing promising results as well as room for improvement in the application of LLMs to support system testing.
Pages: 239 - 255
Page count: 17
Related papers
50 items in total
  • [31] Evaluation of large language models for the classification of medical device software
    Yu Han
    Aaron Ceross
    Florence Bourgeois
    Paulo Savaget
    Jeroen H. M. Bergmann
    Bio-Design and Manufacturing, 2024, 7 (05) : 819 - 822
  • [32] Evaluation of large language models for the classification of medical device software
    Han, Yu
    Ceross, Aaron
    Bourgeois, Florence
    Savaget, Paulo
    Bergmann, Jeroen H. M.
    BIO-DESIGN AND MANUFACTURING, 2024, 7 (05) : 819 - 822
  • [33] Large Language Models for Software Engineering: A Systematic Literature Review
    Hou, Xinyi
    Zhao, Yanjie
    Liu, Yue
    Yang, Zhou
    Wang, Kailong
    Li, Li
    Luo, Xiapu
    Lo, David
    Grundy, John
    Wang, Haoyu
    ACM Transactions on Software Engineering and Methodology, 2024, 33 (08)
  • [34] When Software Security Meets Large Language Models: A Survey
    Zhu, Xiaogang
    Zhou, Wei
    Han, Qing-Long
    Ma, Wanlun
    Wen, Sheng
    Xiang, Yang
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2025, 12 (02) : 317 - 334
  • [35] When Software Security Meets Large Language Models: A Survey
    Xiaogang Zhu
    Wei Zhou
    Qing-Long Han
    Wanlun Ma
    Sheng Wen
    Yang Xiang
    IEEE/CAA Journal of Automatica Sinica, 2025, 12 (02) : 317 - 334
  • [36] Can Large Language Models Better Predict Software Vulnerability?
    Katsadouros, Evangelos
    Patrikakis, Charalampos Z.
    Hurlburt, George
    IT PROFESSIONAL, 2023, 25 (03) : 4 - 8
  • [37] Towards an understanding of large language models in software engineering tasks
    Zheng, Zibin
    Ning, Kaiwen
    Zhong, Qingyuan
    Chen, Jiachi
    Chen, Wenqing
    Guo, Lianghong
    Wang, Weicheng
    Wang, Yanlin
    EMPIRICAL SOFTWARE ENGINEERING, 2025, 30 (02)
  • [38] Evaluating Explanations for Software Patches Generated by Large Language Models
    Sobania, Dominik
    Geiger, Alina
    Callan, James
    Brownlee, Alexander
    Hanna, Carol
    Moussa, Rebecca
    Lopez, Mar Zamorano
    Petke, Justyna
    Sarro, Federica
    SEARCH-BASED SOFTWARE ENGINEERING, SSBSE 2023, 2024, 14415 : 147 - 152
  • [39] AthenaLLM: Supporting Experiments with Large Language Models in Software Development
    de Oliveira, Benedito
    Castor, Fernando
    PROCEEDINGS 2024 32ND IEEE/ACM INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION, ICPC 2024, 2024 : 69 - 73
  • [40] A Proposal of a Software Defect Predication System "FaRSeT-#" for Exploratory Testing
    Kita, Yoshihiro
    Ueda, Kazuki
    Sakurai, Kiyotaka
    JOURNAL OF ROBOTICS NETWORKING AND ARTIFICIAL LIFE, 2022, 9 (02) : 115 - 120