On the Impact of Lower Recall and Precision in Defect Prediction for Guiding Search-based Software Testing

Cited by: 0

Authors:

Perera, Anjana [1,2]

Turhan, Burak [3,4]

Aleti, Aldeida [1]

Boehme, Marcel [4,5]

Affiliations:

[1] Monash Univ, Fac Informat Technol, Wellington Rd, Melbourne, Vic 3800, Australia

[2] Oracle Labs, Brisbane, Qld, Australia

[3] Univ Oulu, Fac Informat Technol & Elect Engn, Pentti Kaiteran Katu 1,POB 3000, Oulu 90570, Finland

[4] Monash Univ, Melbourne, Vic, Australia

[5] Max Planck Inst Secur & Privacy, Univ Str 140, D-44799 Bochum, Germany

Source:

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY | 2024 / Volume 33 / Issue 06

Funding:

Australian Research Council;

Keywords:

Search-based software testing; automated test generation; defect prediction; STATIC CODE ATTRIBUTES; MODELS; FIND;

DOI:

10.1145/3655022

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Defect predictors, static bug detectors, and humans inspecting the code can propose locations in the program that are more likely to be buggy before they are discovered through testing. Automated test generators such as search-based software testing (SBST) techniques can use this information to direct their search for test cases to likely buggy code, thus speeding up the process of detecting existing bugs in those locations. Often the predictions given by these tools or humans are imprecise, which can misguide the SBST technique and may deteriorate its performance. In this article, we study the impact of imprecision in defect prediction on the bug detection effectiveness of SBST. Our study finds that the recall of the defect predictor, i.e., the proportion of correctly identified buggy code, has a significant impact on the bug detection effectiveness of SBST, with a large effect size. More precisely, the SBST technique detects 7.5 fewer bugs on average (out of 420 bugs) for every 5% decrement in recall. However, the effect of precision, a measure of false alarms, is not of meaningful practical significance, as indicated by a very small effect size. In the context of combining defect prediction and SBST, our recommendation is to increase the recall of defect predictors as a primary objective and precision as a secondary objective. In our experiments, we find that 75% precision is as good as 100% precision. To account for the imprecision of defect predictors, in particular low recall values, SBST techniques should be designed to search for test cases that also cover the predicted non-buggy parts of the program, while prioritising the parts that have been predicted as buggy.


Pages: 27
