Software Batch Testing to Save Build Test Resources and to Reduce Feedback Time

Cited by: 10

Authors:

Beheshtian, Mohammad Javad [1]

Bavand, Amir Hossein [1]

Rigby, Peter C. [1]

Affiliations:

[1] Concordia Univ, Dept Comp Sci & Software Engn, Montreal, PQ H3G 1M8, Canada

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2022 / Vol. 48 / Issue 08

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Software testing; batch testing; continuous integration and deployment; bisection; pool testing; reducing testing cost; risk modelling; BUG; PRIORITIZATION; METRICS;

DOI:

10.1109/TSE.2021.3070269

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Testing is expensive, and batching tests has the potential to reduce test costs. The continuous integration strategy of testing each commit or change individually helps to quickly identify faults but leads to a maximal number of test executions. Large companies that have a massive number of commits, e.g., Google and Facebook, or have expensive test infrastructure, e.g., Ericsson, must batch changes together to reduce the total number of test runs. For example, if eight builds are batched together and there is no failure, then we have tested eight builds with one execution, saving seven executions. However, when a failure occurs, it is not immediately clear which build is the cause of the failure. A bisection is run to isolate the failing build, i.e., the culprit build. In our eight-build example, a failure will require an additional six executions, resulting in a saving of one execution. In this work, we re-evaluate batching approaches developed in industry on large open source projects using Travis CI. We also introduce novel batching approaches. In total, we evaluate six approaches. The first is the baseline approach that tests each build individually. The second is the existing bisection approach. The third uses a batch size of four, which we show mathematically reduces the number of executions without requiring bisection. The fourth combines the two prior techniques, introducing a stopping condition to the bisection. The final two approaches use models of build change risk to isolate risky changes and test them in smaller batches. We find that compared to the TestAll baseline, on average, the approaches reduce the number of build test executions across projects by 46, 48, 50, 44, and 49 percent for BatchBisect, Batch4, BatchStop4, RiskTopN, and RiskBatch, respectively. The greatest reduction in executions is BatchStop4 at 50 percent. However, the simple approach of Batch4 does not require bisection and achieves a reduction of 48 percent.
In a larger sample of projects, we find that a project's failure rate is strongly correlated with execution savings (Spearman r = -0.97, p << 0.001). Using Batch4, 85 percent of projects see savings. All projects that have build failures less than 40 percent of the time will benefit from batching. In terms of feedback time, compared to TestAll, we find that BatchBisect, Batch2, Batch4, and BatchStop4 reduce the average feedback time by 33, 16, 32, and 37 percent, respectively. Simple batching not only saves resources but also reduces feedback time without introducing any slip-throughs and without changing the test run order. We suggest that most projects should adjust their CI pipelines to use a batch size of at least two. We release our scripts and data for replication as well as the BatchBuilder tool, which automatically batches submitted changes on GitHub for testing on Travis CI. Since the tool reports individual results for each pull request or pushed commit, the batching happens in the background and the development process is unchanged.
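The eight-build example in the abstract can be reproduced with a short counting sketch. This is not the authors' implementation: the function name `count_executions` is hypothetical, and the model assumes that a batch fails exactly when it contains at least one culprit build and that, after a failure, both halves of the batch are executed. Under those assumptions the counts match the figures quoted in the abstract.

```python
from typing import Sequence


def count_executions(builds: Sequence[bool]) -> int:
    """Count test executions needed to batch-test `builds` with bisection.

    builds[i] is True when build i is a culprit (it would fail on its own).
    Assumptions (not taken from the paper's code): a batch fails if it holds
    at least one culprit, and both halves are executed after a failure.
    """
    executions = 1  # one execution for the batch itself
    if any(builds) and len(builds) > 1:
        # The batch failed and can still be split: bisect and test each half.
        mid = len(builds) // 2
        executions += count_executions(builds[:mid])
        executions += count_executions(builds[mid:])
    return executions


all_pass = [False] * 8
one_culprit = [False] * 7 + [True]

print(count_executions(all_pass))     # 1 execution instead of 8
print(count_executions(one_culprit))  # 7 executions: 1 batch run + 6 for bisection
```

With a single culprit in eight builds, the batch run plus six bisection runs totals seven executions, one fewer than testing each build individually.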


Pages: 2784-2801

Page count: 18

References (70 in total):

[1] Alexeevich B. A., 2019, Patent App., Patent No. [16/206,311, 16206311]
[2] [Anonymous], 2020, TRAVIS CI QUEUE DASH
[3] [Anonymous], 2015, git-bisect Manual Page
[4] Aragon-Caqueo, Diego; Fernandez-Salinas, Javier; Laroze, David. Optimization of group size in pool testing strategy for SARS-CoV-2: A simple mathematical model [J]. JOURNAL OF MEDICAL VIROLOGY, 2020, 92 (10): 1988-1994
[5] Aversano L., 2007, Proceedings of the Foundations of Software Engineering, P19
[6] Beheshtian M. J., 2020, BATCHBUILDER GITHUB
[7] Beheshtian M. J., 2021, REPLICATION PACKAGE
[8] Beller, Moritz; Gousios, Georgios; Zaidman, Andy. Oops, My Tests Broke the Build: An Explorative Analysis of Travis CI with GitHub [J]. 2017 IEEE/ACM 14TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2017), 2017: 356-367
[9] Beller, Moritz; Gousios, Georgios; Zaidman, Andy. TravisTorrent: Synthesizing Travis CI and GitHub for Full-Stack Research on Continuous Integration [J]. 2017 IEEE/ACM 14TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2017), 2017: 447-450
[10] Chen, TY; Lau, MF. Dividing strategies for the optimization of a test suite [J]. INFORMATION PROCESSING LETTERS, 1996, 60 (03): 135-141
