Convolutional neural network architecture search based on fractal decomposition optimization algorithm

Cited by: 6
Authors
Souquet, Leo [1]
Shvai, Nadiya [1]
Llanza, Arcadi [1,2]
Nakib, Amir [2]
Affiliations
[1] Vinci Autoroutes, Nanterre, France
[2] Univ Paris Est Creteil, Lab LISSI, 122 Rue Paul Armangot, F-94400 Vitry Sur Seine, France
Keywords
Neural architecture search; Hyperparameters optimization; Fractal decomposition; CNN;
DOI
10.1016/j.eswa.2022.118947
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new approach to designing the architecture and optimizing the hyperparameters of a deep convolutional neural network (CNN) via the Fractal Decomposition Algorithm (FDA). This optimization algorithm was recently proposed to solve continuous optimization problems. It is based on a geometric fractal decomposition that divides the search space while searching for the best possible solution. As FDA is effective in single-objective optimization, in this work we aim to show that it can also be successfully applied to fine-tuning deep neural network architectures. Moreover, a new formulation based on bi-level optimization is proposed to separate the architecture search, composed of discrete parameters, from the optimization of the hyperparameters. This is motivated by the fact that automating the construction of deep neural architectures has become an important focus in recent years, as manual construction is considerably time-consuming, error-prone, and requires in-depth knowledge. To solve the resulting bi-level problem, a random search is first performed to create a set of candidate architectures; the best ones are then fine-tuned using FDA. The CIFAR-10 and CIFAR-100 benchmarks were used to evaluate the performance of the proposed approach. The results obtained are among the state of the art for the corresponding class of networks (chain-structured CNN architectures with a low number of parameters). These results are emphasized by the fact that the whole process was performed with low computing power, using only 3 NVIDIA V100 GPUs. The source code is available at https://github.com/alc1218/Convolutional-Neural-Network-Architecture-Search-Based-on-Fractal-Decomposition-Optimization.
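The search idea the abstract describes, recursively decomposing a continuous search space into sub-regions and descending into the most promising one, can be sketched as follows. This is a simplified toy illustration of the fractal-decomposition principle on a stand-in objective, not the authors' exact FDA: the region shape, the 2d axis-aligned decomposition, the halving ratio, and the `sphere` objective are all assumptions made for the sketch.

```python
def sphere(x):
    # Toy objective standing in for a validation loss (assumption).
    return sum(v * v for v in x)

def fractal_search(f, center, radius, depth, best):
    """Recursively decompose the region around `center`: evaluate it,
    spawn 2*d child regions (one per coordinate direction), and recurse
    into the best-scoring child with half the radius."""
    score = f(center)
    if score < best[1]:
        best[0], best[1] = list(center), score  # track best point found
    if depth == 0:
        return best
    children = []
    for i in range(len(center)):
        for sign in (-1.0, 1.0):
            child = list(center)
            child[i] += sign * radius / 2.0
            children.append(child)
    children.sort(key=f)  # most promising sub-region first
    return fractal_search(f, children[0], radius / 2.0, depth - 1, best)

best = [None, float("inf")]
fractal_search(sphere, [2.0, -1.5], 2.0, 12, best)
print(best)  # best point found and its (much reduced) objective value
```

In the paper's bi-level formulation, such a continuous search would operate on the hyperparameters of architectures that a prior random search has already selected, with the objective being each candidate's validation performance rather than a closed-form function.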
Pages: 13