Optimizing Search-Based Unit Test Generation with Large Language Models: An Empirical Study

Times cited: 1

Authors:

Xiao, Danni [1]

Guo, Yimeng [1]

Li, Yanhui [1]

Chen, Lin [1]

Affiliations:

[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Peoples R China

Source:

PROCEEDINGS OF THE 15TH ASIA-PACIFIC SYMPOSIUM ON INTERNETWARE, INTERNETWARE 2024 | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Unit Test; Search-based Testing; Large Language Model; Optimization;

DOI:

10.1145/3671016.3674813

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Search-based unit test generation methods are widely applied and considered effective, and Large Language Models (LLMs) have demonstrated strong generation ability. Some researchers have therefore proposed using LLMs to enhance search-based unit test generation and have obtained preliminary evidence that LLMs can help alleviate test coverage plateaus. However, it remains unclear when and how LLMs should intervene in the time-consuming test generation process. This paper explores applying LLMs at various stages of search-based test generation (SBTG), including the initial stage, the test generation period, and test coverage plateaus, as well as strategies for controlling the frequency of LLM intervention. A comprehensive empirical study was conducted on 486 Python benchmark modules from 27 projects. The experimental results show that 1) LLM intervention has a positive effect at every stage studied, whether the goal is to improve coverage within a fixed period or to reduce the time needed to reach a specific coverage; and 2) a reasonable intervention frequency is crucial for LLMs to have a positive effect on SBTG. This work helps clarify when and how LLMs should be applied in SBTG and provides practical suggestions for developers.

Citation

Pages: 71-80

Page count: 10

Cited references: 40 entries in total
