Enhancing CNN structure and learning through NSGA-II-based multi-objective optimization

Cited by: 4
Authors
Elghazi, Khalid [1 ]
Ramchoun, Hassan [2 ]
Masrour, Tawfik [1 ,3 ]
Affiliations
[1] Moulay Ismail Univ, Natl Sch Arts & Crafts, Lab Math Modeling Simulat & Smart Syst L2M3S, Meknes, Morocco
[2] Univ Moulay Ismail, Natl Sch Business & Management, Lab Math Modeling Simulat & Smart Syst L2M3S, Meknes, Morocco
[3] Univ Quebec Rimouski, Math Comp Sci & Engn Dept, Rimouski, PQ, Canada
Keywords
Convolutional neural network; Neural architecture search; Multiobjective optimization; NSGA-II; Image classification; Model complexity; Neural network; Architectures
DOI
10.1007/s12530-024-09574-9
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the advancement of convolutional neural networks (CNNs) has been driven by the pursuit of higher classification accuracy in image tasks. However, achieving optimal performance often requires extensive manual design that incorporates domain-specific knowledge and problem understanding, and this typically yields highly complex network architectures while overlooking the drawbacks of such complexity. To this end, we propose MOGA-CNN, a Multi-Objective Genetic Algorithm for CNN structure that treats CNN architecture design as a bi-objective optimization problem. MOGA-CNN aims to simultaneously maximize classification accuracy and minimize computational complexity, measured by the number of learnable parameters. We employ the NSGA-II algorithm to effectively explore the trade-off between these two conflicting objectives. The main contribution of this paper is an encoding mechanism that captures the essential hyperparameters influencing CNN architecture, including the fully connected layer. To evaluate the effectiveness of the proposed algorithm, we conducted extensive experiments on four datasets, comparing its performance against other state-of-the-art methods. The results consistently demonstrate that our approach achieves competitive results compared to these approaches.
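The bi-objective selection at the heart of NSGA-II can be sketched with a minimal fast non-dominated sort over (error, parameter count) pairs. Everything below is an illustrative stand-in, not the paper's method: `random_individual` is a toy per-layer filter-count encoding (the paper's encoding also covers the fully connected layer and other hyperparameters), `proxy_error` replaces actual network training, and crowding-distance selection, crossover, and mutation are omitted.

```python
import random

def random_individual(rng, max_layers=4):
    # Toy encoding: a variable-length list of conv filter counts.
    n_layers = rng.randint(1, max_layers)
    return [rng.choice([16, 32, 64, 128]) for _ in range(n_layers)]

def param_count(ind):
    # Rough 3x3-conv learnable-parameter estimate (biases ignored).
    total, in_ch = 0, 3
    for out_ch in ind:
        total += in_ch * out_ch * 9
        in_ch = out_ch
    return total

def proxy_error(ind):
    # Stand-in for validation error; the real method trains each candidate CNN.
    return 1.0 / (1.0 + sum(ind))

def dominates(a, b):
    # Both objectives are minimized: (error, parameter count).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_nondominated_sort(objs):
    # Standard NSGA-II sort: front 0 is the Pareto-optimal set.
    fronts, S, n = [[]], [set() for _ in objs], [0] * len(objs)
    for p in range(len(objs)):
        for q in range(len(objs)):
            if dominates(objs[p], objs[q]):
                S[p].add(q)          # p dominates q
            elif dominates(objs[q], objs[p]):
                n[p] += 1            # count of individuals dominating p
        if n[p] == 0:
            fronts[0].append(p)
    i = 0
    while fronts[i]:
        nxt = []
        for p in fronts[i]:
            for q in S[p]:
                n[q] -= 1
                if n[q] == 0:
                    nxt.append(q)
        i += 1
        fronts.append(nxt)
    return fronts[:-1]

rng = random.Random(0)
pop = [random_individual(rng) for _ in range(12)]
objs = [(proxy_error(ind), float(param_count(ind))) for ind in pop]
pareto = [pop[i] for i in fast_nondominated_sort(objs)[0]]
```

In a full NSGA-II loop, the fronts returned here would drive elitist survivor selection (filling the next population front by front, breaking ties by crowding distance), which is how the algorithm exposes the accuracy/complexity trade-off as a Pareto set rather than a single network.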
Pages: 1503-1519
Page count: 17