Sample size estimation for power and accuracy in the experimental comparison of algorithms

Cited by: 10
Authors
Campelo, Felipe [1]
Takahashi, Fernanda [2]
Affiliations
[1] Univ Fed Minas Gerais, Dept Elect Engn, BR-31270901 Belo Horizonte, MG, Brazil
[2] Univ Fed Minas Gerais, Grad Program Elect Engn, BR-31270901 Belo Horizonte, MG, Brazil
Keywords
Experimental comparison of algorithms; Statistical methods; Sample size estimation; Accuracy of parameter estimation; Iterative sampling; EVOLUTIONARY ALGORITHMS; PERFORMANCE; OPTIMIZATION; TESTS; INTELLIGENCE; DESIGN
DOI
10.1007/s10732-018-9396-7
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Experimental comparisons of performance represent an important aspect of research on optimization algorithms. In this work we present a methodology for defining the required sample sizes for designing experiments with desired statistical properties for the comparison of two methods on a given problem class. The proposed approach allows the experimenter to define desired levels of accuracy for estimates of mean performance differences on individual problem instances, as well as the desired statistical power for comparing mean performances over a problem class of interest. The method calculates the required number of problem instances, and runs the algorithms on each test instance so that the accuracy of the estimated differences in performance is controlled at the predefined level. Two examples illustrate the application of the proposed method, and its ability to achieve the desired statistical properties with a methodologically sound definition of the relevant sample sizes.
Pages: 305-338
Page count: 34