LTM: Scalable and Black-Box Similarity-Based Test Suite Minimization Based on Language Models

Cited by: 0
Authors
Pan, Rongqi [1]
Ghaleb, Taher A. [2, 3]
Briand, Lionel C. [4, 5]
Affiliations
[1] Univ Ottawa, Sch EECS, Ottawa, ON K1N 6N5, Canada
[2] Trent Univ, Comp Sci Dept, Peterborough, ON K9L 0G2, Canada
[3] Univ Ottawa, Ottawa, ON K1N 6N5, Canada
[4] Univ Limerick, Lero SFI Ctr Software Res, Limerick V94T9PX, Ireland
[5] Univ Ottawa, Sch EECS, Ottawa, ON K1N 6N5, Canada
Funding
Science Foundation Ireland; Natural Sciences and Engineering Research Council of Canada
Keywords
Minimization; Codes; Fault detection; Closed box; Scalability; Time measurement; Genetic algorithms; Source coding; Vectors; Unified modeling language; Test suite minimization; test suite reduction; pre-trained language models; genetic algorithm; black-box testing; SELECTION; PRIORITIZATION;
DOI
10.1109/TSE.2024.3469582
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Test suites tend to grow when software evolves, making it often infeasible to execute all test cases with the allocated testing budgets, especially for large software systems. Test suite minimization (TSM) is employed to improve the efficiency of software testing by removing redundant test cases, thus reducing testing time and resources while maintaining the fault detection capability of the test suite. Most existing TSM approaches rely on code coverage (white-box) or model-based features, which are not always available to test engineers. Recent TSM approaches that rely only on test code (black-box) have been proposed, such as ATM and FAST-R. The former yields higher fault detection rates (FDR) while the latter is faster. To address scalability while retaining a high FDR, we propose LTM (Language model-based Test suite Minimization), a novel, scalable, and black-box similarity-based TSM approach based on large language models (LLMs), which is the first application of LLMs in the context of TSM. To support similarity measurement using test method embeddings, we investigate five different pre-trained language models: CodeBERT, GraphCodeBERT, UniXcoder, StarEncoder, and CodeLlama, on which we compute two similarity measures: Cosine Similarity and Euclidean Distance. Our goal is to find similarity measures that are not only computationally more efficient but can also better guide a Genetic Algorithm (GA), which is used to search for optimal minimized test suites, thus reducing the overall search time. Experimental results show that the best configuration of LTM (UniXcoder/Cosine) outperforms ATM in three aspects: (a) achieving a slightly greater saving rate of testing time (41.72% versus 41.02%, on average); (b) attaining a significantly higher fault detection rate (0.84 versus 0.81, on average); and, most importantly, (c) minimizing test suites nearly five times faster on average, with higher gains for larger test suites and systems, thus achieving much higher scalability.
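Illustrative note (not part of the published abstract): the two similarity measures named above, Cosine Similarity and Euclidean Distance, can be sketched over test method embedding vectors as follows. This is a minimal sketch under stated assumptions: the embeddings are taken to be plain lists of floats already produced by some language model, and the function names, including the subset score `mean_pairwise_similarity`, are hypothetical illustrations rather than the authors' implementation or their GA fitness function.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line distance between two embedding vectors (0.0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_pairwise_similarity(
    embeddings: Sequence[Sequence[float]], subset: Sequence[int]
) -> float:
    """Hypothetical subset score: average cosine similarity over all pairs of
    selected test cases. A search procedure could prefer subsets with a lower
    score, i.e. more diverse test cases."""
    pairs = list(combinations(subset, 2))
    if not pairs:
        return 0.0
    total = sum(cosine_similarity(embeddings[i], embeddings[j]) for i, j in pairs)
    return total / len(pairs)


# Toy embeddings: tests 0 and 1 are redundant, test 2 is different.
toy_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
redundant_score = mean_pairwise_similarity(toy_embeddings, [0, 1])
diverse_score = mean_pairwise_similarity(toy_embeddings, [0, 2])
```

In this toy setup, the subset containing the two identical test embeddings scores higher than the subset containing two orthogonal ones, which is the kind of signal a similarity-guided search could use to drop redundant test cases.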
Pages: 3053-3070
Page count: 18