Automatic model selection for fully connected neural networks

Cited by: 1
Authors
Laredo D. [1]
Ma S.F. [2]
Leylaz G. [2]
Schütze O. [1]
Sun J.-Q. [2]
Affiliations
[1] Department of Computer Science, CINVESTAV, Mexico City, Mexico
[2] Department of Mechanical Engineering, University of California, Merced, CA 95343
Keywords
Artificial neural networks; Distributed computing; Evolutionary algorithms; Hyperparameter tuning; Model selection
DOI
10.1007/s40435-020-00708-w
Abstract
Neural networks and deep learning are changing the way artificial intelligence is practiced. Efficiently choosing a suitable network architecture and fine-tuning its hyper-parameters for a specific dataset is a time-consuming task, given the staggering number of possible alternatives. In this paper, we address the problem of model selection by means of a fully automated framework for efficiently selecting a neural network model for a given task, whether classification or regression. The algorithm, named Automatic Model Selection (AMS), is a modified micro-genetic algorithm that automatically and efficiently finds the most suitable fully connected neural network model for a given dataset. The main contributions of this method are: a simple, list-based encoding for neural networks, which serves as the genotype in our evolutionary algorithm; novel crossover and mutation operators; a fitness function that considers both the accuracy of the neural network and its complexity; and a method to measure the similarity between two neural networks. AMS is evaluated on two different datasets. By comparing models obtained with AMS to state-of-the-art models for each dataset, we show that AMS can automatically find efficient neural network models. Furthermore, AMS is computationally efficient and can exploit distributed computing paradigms to further boost its performance. © 2020, Springer-Verlag GmbH Germany, part of Springer Nature.
Pages: 1063–1079
Number of pages: 16
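
The abstract names four components of AMS: a list-based genotype for fully connected networks, crossover and mutation operators, a fitness that trades accuracy against complexity, and a similarity measure between networks. The Python sketch below is a minimal illustration of how such pieces could fit together; the genotype layout, the operator details, the complexity penalty `alpha`, and the similarity formula are all assumptions, since the abstract does not specify them.

```python
import random

# Hypothetical list-based genotype: one (units, activation) tuple per hidden
# layer of a fully connected network. Illustrative only; the paper's exact
# encoding is not given in the abstract.
ACTIVATIONS = ["relu", "tanh", "sigmoid"]

def random_genotype(max_layers=4, max_units=256):
    """Sample a random fully connected architecture."""
    n_layers = random.randint(1, max_layers)
    return [(random.randint(8, max_units), random.choice(ACTIVATIONS))
            for _ in range(n_layers)]

def crossover(a, b):
    """One-point crossover on the layer lists (illustrative operator)."""
    cut_a = random.randint(1, len(a))
    cut_b = random.randint(1, len(b))
    return a[:cut_a] + b[cut_b:]

def mutate(genotype, p=0.2):
    """Perturb layer widths and activations with probability p per layer."""
    out = []
    for units, act in genotype:
        if random.random() < p:
            units = max(8, int(units * random.uniform(0.5, 1.5)))
            act = random.choice(ACTIVATIONS)
        out.append((units, act))
    return out

def fitness(accuracy, genotype, alpha=1e-4):
    """Accuracy penalized by complexity (here: total hidden units).
    The abstract says fitness considers accuracy and complexity; this
    particular trade-off is an assumption."""
    complexity = sum(units for units, _ in genotype)
    return accuracy - alpha * complexity

def similarity(a, b):
    """Layer-wise similarity: width agreement, discounted when activations
    differ. Illustrative only; AMS's actual measure is not described."""
    score = 0.0
    for (u1, a1), (u2, a2) in zip(a, b):
        score += (min(u1, u2) / max(u1, u2)) * (1.0 if a1 == a2 else 0.5)
    return score / max(len(a), len(b))

if __name__ == "__main__":
    parent_a, parent_b = random_genotype(), random_genotype()
    child = mutate(crossover(parent_a, parent_b))
    print(child, fitness(accuracy=0.9, genotype=child))
```

A micro-genetic algorithm would evolve a very small population of such genotypes, training each candidate to obtain its accuracy and restarting around the elite individual when the population converges.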