Contrasting test selection, prioritization, and batch testing at scale

Times Cited: 0
Authors
Fallahzadeh, Emad [1 ]
Rigby, Peter C. [2 ]
Adams, Bram [1 ]
Affiliations
[1] Queen's Univ, Sch Comp, Kingston, ON, Canada
[2] Concordia Univ, Dept Comp Sci & Software Engn, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Test selection; Test prioritization; Test batching; Test optimization; Parallel testing; Chrome testing; TRAVIS CI; BUILD;
DOI
10.1007/s10664-024-10589-8
CLC Number
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
The effectiveness of software testing is crucial for successful software releases, and various test optimization techniques aim to enhance this process by reducing the number of test executions or by prioritizing tests that are likely to fail. Although different families of techniques exist, each with its own evaluation criteria, few studies have compared these lines of research. This study addresses that gap by empirically comparing Yaraghi et al.'s test prioritization approach, Zhu et al.'s cross-build test prioritization and its equivalent test selection technique, and our BatchAll test batching algorithm. To evaluate these test optimization approaches, we empirically analyze millions of test results from Google Chrome, pre- and post-commit test outcomes for a Google project, and the JMRI Travis CI dataset. Findings reveal that test selection can reduce the actual median feedback time by up to 96% with the same number of machines, but may miss up to 55% of failures. In contrast, batching achieves up to a 99% reduction in feedback time without missing any failures. Test selection cuts machine usage by up to 66%, while batching achieves up to an 88% reduction. For failure detection, test selection is up to 62 minutes faster than the baseline, and the batching algorithm achieves up to a 63-minute median improvement without missing failures. Regarding test execution time, test selection saves up to 66%, whereas batching's savings can reach up to 98%, although its performance varies with the machines used. The studied test prioritization algorithms significantly underperform the test selection and batching algorithms. In conclusion, this study provides practical recommendations for selecting appropriate test optimization algorithms based on the testing environment and failure-loss tolerance.
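For context, "batching" here means running the test suite once over a group of pending commits rather than once per commit, isolating culprits only when the batch fails. The following minimal Python sketch illustrates that general batch-then-bisect idea; the run_tests hook, the halving strategy, and the example lambda are hypothetical illustrations under the usual assumption that a culprit commit also fails inside any batch containing it, not the paper's BatchAll implementation.

    # Minimal sketch of the batch-then-bisect idea behind test batching.
    # NOT the paper's BatchAll algorithm; run_tests is a hypothetical hook
    # that returns True when the suite passes on the batch's combined changes.
    from typing import Callable, List

    def test_batch(run_tests: Callable[[List[str]], bool],
                   commits: List[str]) -> List[str]:
        """Return the commits whose changes cause test failures."""
        if run_tests(commits):       # one suite execution covers the batch
            return []                # batch passes: every commit is cleared
        if len(commits) == 1:
            return commits           # a single failing commit is the culprit
        mid = len(commits) // 2      # otherwise bisect and recurse on both
        return (test_batch(run_tests, commits[:mid]) +   # halves, so several
                test_batch(run_tests, commits[mid:]))    # culprits are found

    # Example: a hypothetical suite that fails whenever commit "c3" is included.
    culprits = test_batch(lambda batch: "c3" not in batch,
                          ["c1", "c2", "c3", "c4"])
    print(culprits)  # -> ['c3']

With k batched commits and a single culprit, this costs on the order of log k suite executions instead of k, which is the intuition behind the feedback-time and machine-usage savings the abstract attributes to batching.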
Pages: 65
References (56 in total)
  • [1] FlakeFlagger: Predicting Flakiness Without Rerunning Tests
    Alshammari, Abdulrahman
    Morris, Christopher
    Hilton, Michael
    Bell, Jonathan
    [J]. 2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021), 2021, : 1572 - 1584
  • [2] Anderson J., 2014, P 11 WORK C MIN SOFT, P142, DOI 10.1145/2597073.2597084
  • [3] AutoPar-Clava: An Automatic Parallelization source-to-source tool for C code applications
    Arabnejad, Hamid
    Bispo, Joao
    Barbosa, Jorge G.
    Cardoso, Joao M. P.
    [J]. PARMA-DITAM 2018: 9TH WORKSHOP ON PARALLEL PROGRAMMING AND RUNTIME MANAGEMENT TECHNIQUES FOR MANY-CORE ARCHITECTURES AND 7TH WORKSHOP ON DESIGN TOOLS AND ARCHITECTURES FOR MULTICORE EMBEDDED COMPUTING PLATFORMS, 2018, : 13 - 19
  • [4] Reinforcement Learning for Test Case Prioritization
    Bagherzadeh, Mojtaba
    Kahani, Nafiseh
    Briand, Lionel
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (08) : 2836 - 2856
  • [5] Bagies TOS, 2020, Parallelizing unit test execution on GPU
  • [6] Mining Historical Test Failures to Dynamically Batch Tests to Save CI Resources
    Bavand, Amir Hossein
    Rigby, Peter C.
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME 2021), 2021, : 217 - 226
  • [7] Software Batch Testing to Save Build Test Resources and to Reduce Feedback Time
    Beheshtian, Mohammad Javad
    Bavand, Amir Hossein
    Rigby, Peter C.
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (08) : 2784 - 2801
  • [8] DEFLAKER: Automatically Detecting Flaky Tests
    Bell, Jonathan
    Legunsen, Owolabi
    Hilton, Michael
    Eloussi, Lamyaa
    Yung, Tifany
    Marinov, Darko
    [J]. PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2018, : 433 - 444
  • [9] Efficient Dependency Detection for Safe Java Test Acceleration
    Bell, Jonathan
    Kaiser, Gail
    Melski, Eric
    Dattatreya, Mohan
    [J]. 2015 10TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE 2015) PROCEEDINGS, 2015, : 770 - 781
  • [10] Oops, My Tests Broke the Build: An Explorative Analysis of Travis CI with GitHub
    Beller, Moritz
    Gousios, Georgios
    Zaidman, Andy
    [J]. 2017 IEEE/ACM 14TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2017), 2017, : 356 - 367