Hyper-Parameter Optimization for Deep Learning by Surrogate-based Model with Weighted Distance Exploration

Times Cited: 0
Authors
Li, Zhenhua [1 ]
Shoemaker, Christine A. [2 ]
Affiliations
[1] Nanjing University of Aeronautics and Astronautics, School of Computer Science and Technology, Nanjing, People's Republic of China
[2] National University of Singapore, Department of Industrial Systems Engineering, Singapore, Singapore
Source
2021 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC 2021), 2021
Keywords
hyper-parameter optimization; deep neural networks; surrogate optimization; radial basis function; GLOBAL OPTIMIZATION; SEARCH;
DOI
10.1109/CEC45853.2021.9504777
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
To improve hyper-parameter optimization for deep neural networks, we develop a deterministic surrogate optimization algorithm as an efficient alternative to Bayesian optimization. A deterministic radial basis function (RBF) surrogate model is built to interpolate previously evaluated points, and this surrogate is incrementally updated in each iteration. The stochastic algorithm CMA-ES is used to search the acquisition function based on the surrogate. The acquisition function at a point x is a weighted average of the surrogate value at x and the minimum distance from x to the previously evaluated points. We evaluate the proposed algorithm, RBF-CMA, on hyper-parameter optimization tasks for deep convolutional neural networks on the CIFAR-10, SVHN, and CIFAR-100 datasets. RBF-CMA achieves promising performance, especially when the search space dimension is high, in comparison with other algorithms including GP-EI, GP-LCB, and SMBO.
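The abstract's core idea (an interpolating RBF surrogate plus an acquisition that mixes the surrogate prediction with the distance to already-evaluated points) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the cubic kernel with linear tail, the fixed weight `w`, and the sign convention on the distance term are assumptions here, and the CMA-ES search over the acquisition is omitted.

```python
import numpy as np

def fit_rbf(X, y):
    """Fit a cubic RBF interpolant with a linear polynomial tail.

    X: (n, d) array of evaluated hyper-parameter points, y: (n,) losses.
    Returns a callable s(x) giving the surrogate prediction at x.
    (Cubic kernel + linear tail is one common choice, assumed here.)
    """
    n, d = X.shape
    # Pairwise distances between evaluated points -> cubic kernel matrix.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    Phi = D ** 3
    # Linear polynomial tail [x, 1] makes the system uniquely solvable.
    P = np.hstack([X, np.ones((n, 1))])
    A = np.block([[Phi, P], [P.T, np.zeros((d + 1, d + 1))]])
    rhs = np.concatenate([y, np.zeros(d + 1)])
    coef = np.linalg.solve(A, rhs)
    lam, b = coef[:n], coef[n:]

    def s(x):
        r = np.linalg.norm(X - x, axis=1)
        return r ** 3 @ lam + np.append(x, 1.0) @ b
    return s

def acquisition(x, s, X, w=0.7):
    """Weighted combination of surrogate value and distance to sampled points.

    Smaller is better: low surrogate value (exploitation) and large minimum
    distance to evaluated points (exploration). w is a hypothetical fixed
    weight; the paper's weighting schedule is not reproduced here.
    """
    dmin = np.linalg.norm(X - x, axis=1).min()
    return w * s(x) - (1.0 - w) * dmin
```

In the full algorithm the next hyper-parameter point would be the minimizer of `acquisition` found by CMA-ES; after evaluating the true loss there, the new point is appended to `X`/`y` and the surrogate is refit.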
Pages: 917-925 (9 pages)