Scalable reinforcement learning-based neural architecture search

Cited by: 3
Authors
Cassimon, Amber [1 ]
Mercelis, Siegfried [1 ]
Mets, Kevin [1 ]
Affiliation
[1] IDLab - Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, Antwerp, Belgium
Keywords
AutoML; Deep learning; Neural architecture search; Reinforcement learning;
DOI
10.1007/s00521-024-10445-2
Abstract
We assess the feasibility of a reusable neural architecture search agent, aimed at amortizing the initial time investment in building a good search strategy. We do this through reinforcement learning, where an agent learns to iteratively select the best way to modify a given neural network architecture. The agent uses a transformer-based design and is trained with the Ape-X algorithm. We consider both the NAS-Bench-101 and NAS-Bench-301 settings, and compare against known strong baselines such as local search and random search. While achieving competitive performance on both benchmarks, the amount of training required for the much larger NAS-Bench-301 search space is only marginally greater than for NAS-Bench-101, illustrating the strong scaling properties of our agent. Although the agent achieves strong performance, the choice of values for certain parameters is crucial to ensuring successful training. We provide guidance on selecting appropriate hyperparameter values through a detailed description of our experimental setup and several ablation studies. © The Author(s) 2024.
Pages: 231-261
Number of pages: 30
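
The abstract describes the search as an agent repeatedly choosing how to modify a candidate architecture, with architecture quality read off from a tabular or surrogate benchmark. As a rough illustration only (not the authors' code), the following Python sketch sets up a NAS-Bench-101-style cell (a 7-node DAG with that benchmark's three operation choices) and a single-step "modify the architecture" action space; the improvement-based reward, the toy accuracy stub, and all function names are assumptions made for this example, and a real setup would query NAS-Bench-101/301 and score actions with the trained policy.

# Illustrative sketch, not the paper's implementation: a toy environment in which
# each action is a single modification of a NAS-Bench-101-style cell.
import random

OPS = ["conv3x3-bn-relu", "conv1x1-bn-relu", "maxpool3x3"]  # NAS-Bench-101 op set
NUM_NODES = 7  # input node, 5 intermediate nodes, output node


def random_cell(rng):
    """Sample a random upper-triangular adjacency matrix and op labels."""
    adj = [[0] * NUM_NODES for _ in range(NUM_NODES)]
    for i in range(NUM_NODES):
        for j in range(i + 1, NUM_NODES):
            adj[i][j] = rng.randint(0, 1)
    ops = [rng.choice(OPS) for _ in range(NUM_NODES - 2)]  # intermediate nodes only
    return adj, ops


def legal_actions():
    """Enumerate all single-step modifications: toggle one edge or relabel one op."""
    acts = [("toggle_edge", (i, j)) for i in range(NUM_NODES) for j in range(i + 1, NUM_NODES)]
    acts += [("set_op", (n, op)) for n in range(NUM_NODES - 2) for op in OPS]
    return acts


def apply_action(adj, ops, action):
    """Return a modified copy of the cell after applying one action."""
    adj = [row[:] for row in adj]
    ops = ops[:]
    kind, payload = action
    if kind == "toggle_edge":
        i, j = payload
        adj[i][j] = 1 - adj[i][j]
    elif kind == "set_op":
        node, new_op = payload
        ops[node] = new_op
    return adj, ops


def proxy_accuracy(adj, ops):
    """Stand-in for a benchmark lookup (a NAS-Bench-101/301 query in the paper's setting)."""
    key = (tuple(tuple(row) for row in adj), tuple(ops))
    return (hash(key) % 10_000) / 10_000.0


if __name__ == "__main__":
    rng = random.Random(0)
    adj, ops = random_cell(rng)
    best = proxy_accuracy(adj, ops)
    for step in range(10):
        # A trained agent would score candidate actions with its policy; we pick randomly here.
        action = rng.choice(legal_actions())
        adj, ops = apply_action(adj, ops, action)
        acc = proxy_accuracy(adj, ops)
        reward = acc - best  # assumed reward shaping: improvement over the best architecture so far
        best = max(best, acc)
        print(f"step {step}: action={action[0]}, reward={reward:+.3f}")

In the paper's setting, action selection would be performed by the transformer-based policy trained with Ape-X (distributed prioritized experience replay) rather than the random choice used above.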