Recent trends in the use of statistical tests for comparing swarm and evolutionary computing algorithms: Practical guidelines and a critical review

Cited by: 402
Authors
Carrasco, J. [1]
Garcia, S. [1]
Rueda, M. M. [2]
Das, S. [3]
Herrera, F. [1]
Affiliations
[1] Univ Granada, Andalusian Res Inst Data Sci & Computat Intellige, Dept Comp Sci & AI, Granada, Spain
[2] Univ Granada, Andalusian Res Inst Math, Dept Stat & Operat Res, Granada, Spain
[3] Indian Stat Inst, Elect & Commun Sci Unit, 203 BT Rd, Kolkata 700108, W Bengal, India
Keywords
Statistical tests; Optimisation; Parametric; Non-parametric; Bayesian; Confidence intervals; Multiple comparisons; Sample size; Performance; Optimization; Intelligence; Classifiers; Inference; Accuracy; Design
DOI
10.1016/j.swevo.2020.100665
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Studying performance is a key aspect of designing evolutionary and swarm intelligence algorithms, and statistical comparisons are a crucial part of such studies because they allow reliable conclusions to be drawn. In this paper, we survey current trends in the statistical analyses proposed for comparing computational intelligence algorithms and describe the statistical background of these tests. We gather and examine approaches taken from different perspectives to summarise the assumptions made by these statistical tests, the conclusions reached, and the steps followed to perform them correctly. We illustrate the use of the most common tests in the context of the Competition on single-objective real parameter optimisation of the IEEE Congress on Evolutionary Computation (CEC) 2017, describe the main advantages and drawbacks of each kind of test, and put forward some recommendations concerning their use.
Pages: 20