Parameter tuning or default values? An empirical investigation in search-based software engineering

Cited by: 0
Authors
Andrea Arcuri
Gordon Fraser
Affiliations
[1] Certus Software V&V Center at Simula Research Laboratory, Department of Computer Science
[2] University of Sheffield
Source
Empirical Software Engineering | 2013 / Vol. 18
Keywords
Search-based software engineering; Test data generation; Object-oriented; Unit testing; Tuning; EvoSuite; Java; Response surface; Design of experiments
DOI
Not available
Abstract
Many software engineering problems have been addressed with search algorithms. Search algorithms usually depend on several parameters (e.g., population size and crossover rate in genetic algorithms), and the choice of these parameters can affect the performance of the algorithm. The No Free Lunch theorem formally proves that no search algorithm can be tuned to have optimal settings for all possible problems. So how should the parameters of a search algorithm be set for a given software engineering problem? In this paper, we carry out the largest empirical analysis so far on parameter tuning in search-based software engineering. More than one million experiments were carried out and statistically analyzed in the context of test data generation for object-oriented software using the EvoSuite tool. The results show that tuning does indeed have an impact on the performance of a search algorithm. However, at least in the context of test data generation, it does not seem easy to find good settings that significantly outperform the “default” values suggested in the literature. This finding has practical value for both researchers (e.g., when different techniques are compared) and practitioners: using “default” values is a reasonable and justified choice, whereas parameter tuning is a long and expensive process that might or might not pay off in the end.
Pages: 594-623
Page count: 29