Comparison-Based Natural Gradient Optimization in High Dimension

Cited by: 31
Authors
Akimoto, Youhei [1 ]
Auger, Anne [2 ]
Hansen, Nikolaus [2 ]
Affiliations
[1] Shinshu Univ, Fac Engn, Nagano, Japan
[2] INRIA, Res Ctr Saclay Ile de France, Orsay, France
Source
GECCO'14: PROCEEDINGS OF THE 2014 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE | 2014
Keywords
Covariance Matrix Adaptation; Natural Gradient; Hessian Matrix; Information Geometric Optimization; Theory; ADAPTATION; TIME;
DOI
10.1145/2576768.2598258
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose VD-CMA, a novel natural-gradient-based stochastic search algorithm for the optimization of high-dimensional numerical functions. The algorithm is comparison-based and hence invariant under monotonic transformations of the objective function. It adapts a multivariate normal distribution with a restricted covariance matrix having twice the dimension as degrees of freedom, representing one arbitrarily oriented long axis plus additional axis-parallel scaling. We derive the components of the algorithm and show linear internal time and space complexity. We find empirically that the algorithm adapts its covariance matrix to the inverse Hessian on convex-quadratic functions whose Hessian has one short axis and different scalings on the diagonal. We then evaluate VD-CMA on test functions and compare it to other methods. On functions covered by the internal model of VD-CMA, and on the Rosenbrock function, VD-CMA outperforms CMA-ES (which has quadratic internal time and space complexity) not only in internal complexity but also in the number of function calls with increasing dimension.
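The linear complexity claimed in the abstract rests on the restricted covariance parameterization C = D(I + vv^T)D, where D is diagonal and v is a single vector (hence 2n degrees of freedom). As a minimal illustrative sketch (not the authors' code; function and variable names are ours), a sample from N(m, σ²C) can be drawn in O(n) time without any matrix factorization:

```python
import numpy as np

def sample_vd(mean, sigma, d, v, rng):
    """Draw one sample from N(mean, sigma^2 * C) with C = D(I + v v^T)D,
    where D = diag(d), in O(n) time (no Cholesky factorization needed)."""
    n = len(mean)
    z = rng.standard_normal(n)          # z ~ N(0, I)
    w = v @ v                           # squared norm of v
    if w > 0:
        vbar = v / np.sqrt(w)
        # y = (I + (sqrt(1+w) - 1) vbar vbar^T) z  has covariance I + v v^T,
        # since (1 + s)^2 - 1 = w for s = sqrt(1+w) - 1.
        z = z + (np.sqrt(1.0 + w) - 1.0) * (vbar @ z) * vbar
    return mean + sigma * d * z         # elementwise scaling by D
```

A full covariance matrix would require O(n²) storage and an O(n³) Cholesky factorization for sampling; restricting C to one oriented direction on top of a diagonal is what makes both sampling and the update linear in the dimension.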
Pages: 373-380
Page count: 8
Cited References
21 records in total
  • [1] Theoretical Foundation for CMA-ES from Information Geometry Perspective
    Akimoto, Youhei
    Nagata, Yuichi
    Ono, Isao
    Kobayashi, Shigenobu
    [J]. ALGORITHMICA, 2012, 64 (04) : 698 - 716
  • [2] Akimoto Y, 2010, LECT NOTES COMPUT SC, V6238, P154, DOI 10.1007/978-3-642-15844-5_16
  • [3] Adaptive method of realizing natural gradient learning for multilayer perceptrons
    Amari, S
    Park, H
    Fukumizu, K
    [J]. NEURAL COMPUTATION, 2000, 12 (06) : 1399 - 1409
  • [4] Natural gradient works efficiently in learning
    Amari, S
    [J]. NEURAL COMPUTATION, 1998, 10 (02) : 251 - 276
  • [5] [Anonymous], 2010, P 12 ANN C GENETIC E
  • [6] [Anonymous], 1995, P 6 INT C GEN ALG
  • [7] [Anonymous], 2010, P ADV NEUR INF PROC
  • [8] A tutorial on the cross-entropy method
    De Boer, PT
    Kroese, DP
    Mannor, S
    Rubinstein, RY
    [J]. ANNALS OF OPERATIONS RESEARCH, 2005, 134 (01) : 19 - 67
  • [9] Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES)
    Hansen, N
    Muller, SD
    Koumoutsakos, P
    [J]. EVOLUTIONARY COMPUTATION, 2003, 11 (01) : 1 - 18
  • [10] Completely derandomized self-adaptation in evolution strategies
    Hansen, N
    Ostermeier, A
    [J]. EVOLUTIONARY COMPUTATION, 2001, 9 (02) : 159 - 195