Latent feature learning via autoencoder training for automatic classification configuration recommendation

Cited by: 9
Authors
Deng, Liping [1 ]
Xiao, MingQing [1 ]
Affiliations
[1] Southern Illinois Univ Carbondale, Sch Math & Stat Sci, Carbondale, IL 62901 USA
Keywords
Classifier and hyperparameter recommendation; Denoising autoencoder; Meta-learning; ALGORITHM; SEARCH;
DOI
10.1016/j.knosys.2022.110218
CLC classification number
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Combined Algorithm Selection and Hyperparameter Optimization (CASH) problem seeks the most suitable classifiers and hyperparameters for a given classification problem. In the current literature, the common approaches to the CASH problem are search-based methods such as sequential model-based optimization (SMBO) along with various active-testing strategies. In contrast to these existing approaches, in this paper we propose a new method that incorporates the denoising autoencoder (DAE) into meta-learning (MtL) for automatic configuration (both algorithms and their hyperparameters) recommendation, which proves quite effective compared to standard search-based approaches. More specifically, we set up the configuration search space for CASH, produce the metadata, and generate the classification performance on a set of collected historical datasets. Both the encoder and the decoder of the DAE are then trained with the masked metadata as inputs and the unmasked metadata as targets, so as to extract the subtle latent variables of the metadata and subsequently recover the unmasked inputs. Under our framework, the performance over the entire configuration space can be predicted effectively through two different settings, and the configuration with the highest predicted performance is recommended. The first recommendation approach inactivates some inputs and recovers their entries via the trained encoder and decoder for new problems; in the second approach, the relationship between the acquired latent variables and the meta-features of the historical datasets is modeled via kernel multivariate multiple regression (MMR), so that the performance of new datasets is estimated directly through the MMR and the DAE decoder without requiring any new configuration evaluations.
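The masked-training idea described in the abstract can be illustrated with a rough, hypothetical sketch. Everything below is invented for illustration — the toy low-rank "metadata" matrix, the linear encoder/decoder, the dimensions, and the training loop are assumptions, not the paper's actual architecture or data; only the protocol (masked entries as inputs, unmasked entries as targets, recommendation by highest predicted performance) follows the abstract's description:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy metadata: rows = historical datasets, columns = configurations,
# entries = hypothetical classification performance scores.
n_datasets, n_configs, n_latent = 40, 20, 5
true_latent = rng.random((n_datasets, n_latent))
loadings = rng.random((n_latent, n_configs))
metadata = true_latent @ loadings / n_latent  # low-rank performance matrix

# Randomly mask ("inactivate") 30% of the entries, mimicking
# configurations that were never evaluated on a dataset.
mask = rng.random(metadata.shape) < 0.3
masked = np.where(mask, 0.0, metadata)

# A minimal linear DAE: encoder W_e, decoder W_d, trained to map the
# masked inputs back to the unmasked targets (mean-squared error).
W_e = rng.normal(0, 0.1, (n_configs, n_latent))
W_d = rng.normal(0, 0.1, (n_latent, n_configs))
lr = 0.05
for _ in range(2000):
    z = masked @ W_e           # latent variables of the metadata
    recon = z @ W_d            # reconstructed performance matrix
    err = recon - metadata     # compare against the *unmasked* targets
    W_d -= lr * (z.T @ err) / n_datasets
    W_e -= lr * (masked.T @ (err @ W_d.T)) / n_datasets

# Recommend the configuration with the highest predicted performance
# among the masked (unevaluated) entries of the first dataset.
pred = (masked @ W_e) @ W_d
best_config = int(np.argmax(np.where(mask[0], pred[0], -np.inf)))
```

The paper's DAE is presumably nonlinear and tuned; this linear version only shows why reconstructing unmasked targets from masked inputs lets the model fill in performance estimates for configurations it never ran.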
An automatic classification configuration recommendation system, covering 81 historical problems and 11 common classifiers with a total of 4983 configurations, is established to show the effectiveness of the proposed approach. Comparative results on 45 testing problems demonstrate that the proposed model has superior recommendation capability relative to existing MtL baselines as well as other search-based approaches. (c) 2022 Elsevier B.V. All rights reserved.
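The second recommendation setting — predicting a new dataset's latent code from its meta-features via kernel MMR, then decoding performance estimates without any new configuration evaluations — can also be sketched. All names, dimensions, the RBF kernel choice, and the kernel ridge formulation below are assumptions standing in for the paper's unspecified kernel MMR; the latent codes and decoder are simulated rather than taken from a trained DAE:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: each historical dataset has meta-features F and a
# latent code Z that a trained DAE encoder would produce (simulated here).
n_hist, n_meta, n_latent, n_configs = 50, 8, 4, 15
F = rng.random((n_hist, n_meta))
M = rng.normal(size=(n_meta, n_latent))
Z = np.tanh(F @ M)                            # stand-in for DAE latent variables
W_d = rng.normal(size=(n_latent, n_configs))  # stand-in for the DAE decoder

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Kernel ridge regression with multivariate targets Z: one shared
# kernel matrix, one linear solve per training run.
lam = 1e-3
K = rbf_kernel(F, F)
alpha = np.linalg.solve(K + lam * np.eye(n_hist), Z)

# For a new dataset, predict its latent code from meta-features alone,
# then decode into performance estimates over all configurations --
# no new configuration evaluations are required.
f_new = rng.random((1, n_meta))
z_new = rbf_kernel(f_new, F) @ alpha
perf_est = z_new @ W_d
recommended = int(np.argmax(perf_est))
```

The appeal of this path, as the abstract describes it, is that recommendation for a fresh dataset reduces to computing its meta-features and two matrix products, rather than running any candidate configurations.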
Pages: 16