Learning Hyperparameter Optimization Initializations

Cited by: 0
Authors
Wistuba, Martin [1 ]
Schilling, Nicolas [1 ]
Schmidt-Thieme, Lars [1 ]
Affiliations
[1] Univ Hildesheim, Informat Syst & Machine Learning Lab, Hildesheim, Germany
Source
PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (IEEE DSAA 2015) | 2015
Keywords
GLOBAL OPTIMIZATION; SEARCH
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Hyperparameter optimization is often done manually or by grid search. However, recent research has shown that automatic optimization techniques can accelerate this process and find hyperparameter configurations that lead to better models. Transferring knowledge from previous experiments to a new experiment is currently of particular interest, because it has been shown to further improve hyperparameter optimization. We propose to transfer knowledge by means of an initialization strategy for hyperparameter optimization. In contrast to current state-of-the-art initialization strategies, ours is neither limited to hyperparameter configurations that were evaluated in previous experiments nor does it require meta-features. The initial hyperparameter configurations are derived by optimizing a meta-loss formally defined in this paper. This loss depends on the hyperparameter response functions of the data sets investigated in past experiments. Since these functions are unknown and only a few observations are given, the meta-loss is not differentiable. We propose to approximate each response function by a differentiable plug-in estimator; the initial hyperparameter configuration sequence can then be learned with gradient-based optimization techniques. Extensive experiments are conducted on two meta-data sets. Our initialization strategy is compared to state-of-the-art initialization strategies and to further methods that transfer knowledge between data sets. We give empirical evidence that our approach improves over the state of the art.
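To make the procedure concrete, the following is a minimal sketch of the idea described in the abstract, written in Python. All modeling choices here are illustrative assumptions rather than the authors' exact formulation: a Nadaraya-Watson kernel regressor stands in for the differentiable plug-in estimator of each response function, a softmin replaces the non-smooth minimum over the initialization sequence, and gradients are taken by finite differences.

    import numpy as np

    # Hedged sketch: kernel estimator, softmin smoothing, and
    # finite-difference gradients are illustrative assumptions,
    # not the paper's exact method.

    def plug_in_estimate(x, X_obs, y_obs, bandwidth=0.5):
        """Differentiable Nadaraya-Watson estimate f_hat(x) of one data set's
        hyperparameter response function from observed (config, loss) pairs."""
        sq_dist = np.sum((X_obs - x) ** 2, axis=1)
        w = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return np.sum(w * y_obs) / (np.sum(w) + 1e-12)

    def meta_loss(inits, meta_data, beta=10.0):
        """Sum over past data sets of a softmin over the predicted losses of
        the initial configurations; the softmin keeps the objective smooth."""
        total = 0.0
        for X_obs, y_obs in meta_data:
            preds = np.array([plug_in_estimate(x, X_obs, y_obs) for x in inits])
            w = np.exp(-beta * preds)
            total += np.sum(w * preds) / np.sum(w)  # smooth stand-in for min(preds)
        return total

    def learn_initializations(meta_data, n_init=3, dim=2,
                              lr=0.05, steps=200, eps=1e-4):
        """Gradient descent on the initial configuration sequence, using
        finite differences (an autodiff framework would be used in practice)."""
        rng = np.random.default_rng(0)
        inits = rng.uniform(0.0, 1.0, size=(n_init, dim))  # configs in unit cube
        for _ in range(steps):
            base = meta_loss(inits, meta_data)
            grad = np.zeros_like(inits)
            for i in range(n_init):
                for j in range(dim):
                    shifted = inits.copy()
                    shifted[i, j] += eps
                    grad[i, j] = (meta_loss(shifted, meta_data) - base) / eps
            inits -= lr * grad
        return inits

In this sketch, each element of meta_data is a pair (X_obs, y_obs) holding previously evaluated configurations and their observed losses for one past data set; the rows of the returned inits are the learned initial configurations, to be evaluated first when optimizing hyperparameters on a new data set.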
Pages: 339-348
Page count: 10