Neural Architecture Search Based on a Multi-Objective Evolutionary Algorithm With Probability Stack

Cited by: 55
Authors
Xue, Yu [1 ]
Chen, Chen [1 ]
Slowik, Adam [2 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Software, Nanjing 210044, Peoples R China
[2] Koszalin Univ Technol, Dept Elect & Comp Sci, Koszalin PL-75453, Poland
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; evolutionary computation; multiobjective optimization; neural architecture search (NAS);
DOI
10.1109/TEVC.2023.3252612
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the emergence of deep neural networks, many research fields, such as image classification, object detection, speech recognition, natural language processing, machine translation, and autonomous driving, have made major technological breakthroughs, and the research achievements have been successfully applied in many real-life applications. Combining evolutionary computation with neural architecture search (NAS) is an important approach to improving the performance of deep neural networks. Usually, researchers in this area focus only on precision; thus, the searched neural architectures often perform poorly on other metrics such as time cost. In this article, a multi-objective evolutionary algorithm with a probability stack (MOEA-PS) is proposed for NAS, which considers the two objectives of precision and time consumption. MOEA-PS uses an adjacency list to represent the internal structure of deep neural networks. In addition, a unique mechanism is introduced into the multi-objective genetic algorithm to guide crossover and mutation when generating offspring. Furthermore, structure blocks are stacked using a proxy model to generate deep neural networks. Experimental results on CIFAR-10 and CIFAR-100 demonstrate that the proposed algorithm achieves an error rate similar to that of state-of-the-art NAS algorithms at a lower time cost. Finally, the network structure searched on CIFAR-10 transfers directly to the ImageNet dataset, achieving 73.6% classification accuracy.
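The abstract's core idea of trading off the two objectives can be illustrated with a minimal sketch of Pareto-dominance selection, which is the standard selection criterion in multi-objective evolutionary algorithms. This is not the authors' MOEA-PS code; the function names and candidate tuples below are hypothetical, and both objectives (error rate, search time) are assumed to be minimized.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(population):
    """Return the candidates that no other candidate dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q != p)]


# Hypothetical architecture candidates: (error rate, time cost in GPU-days).
candidates = [(0.030, 4.0), (0.025, 9.0), (0.028, 3.0), (0.035, 2.5), (0.031, 5.0)]
front = pareto_front(candidates)
# The front keeps architectures where lowering the error further would
# require more time, and vice versa; dominated ones are discarded.
```

In a full algorithm such as MOEA-PS, this dominance test would drive survivor selection across generations, while the paper's probability stack additionally biases which parents undergo crossover and mutation.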
Pages: 778-786
Number of pages: 9
References
51 entries in total
[1] Baker B., 2017, arXiv, DOI arXiv:1611.02167
[2] Bengio Y., Simard P., Frasconi P., "Learning Long-Term Dependencies with Gradient Descent Is Difficult," IEEE Transactions on Neural Networks, 1994, 5(2): 157-166
[3] Bi M. X., 2015, Proc. 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols. 1-5, p. 3259
[4] Cai H., 2019, ICLR
[5] Deb K., 2000, Parallel Problem Solving from Nature (PPSN VI), Proc. 6th International Conference (LNCS Vol. 1917), p. 849
[6] Elsken T., 2018, Proc. 6th International Conference on Learning Representations, p. 1
[7] Elsken T., 2017, arXiv, DOI arXiv:1711.04528
[8] He K., Zhang X., Ren S., Sun J., "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778
[9] Hsu C. H., 2018, arXiv, DOI arXiv:1806.10332
[10] Hu J., 2018, Proc. CVPR IEEE, p. 7132, DOI 10.1109/CVPR.2018.00745