SwarmCNN: An efficient method for CNN hyperparameter optimization using PSO and ABC metaheuristic algorithms

Cited: 0
Author
Inik, Ozkan [1 ]
Affiliation
[1] Tokat Gaziosmanpasa Univ, Dept Comp Engn, Tasliciftlik Campus, TR-60250 Tokat, Turkiye
Keywords
ABC; CIFAR-10; Hyperparameter optimization; PSO; MNIST; Neural architecture search; NEURAL-NETWORK; ARCHITECTURES; SEARCH;
DOI
10.1007/s11227-025-07347-y
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline code
0812 ;
Abstract
Convolutional neural networks (CNNs) have become very popular, as they can successfully solve problems in many areas by obtaining representations of input data at different layers with tuned hyperparameters. A CNN's hyperparameters include design parameters (DPs), which describe the depth of the CNN and the order of its layers; layer parameters (LPs), which are set for each CNN layer; and training parameters, which govern how the CNN is trained. The performance of a CNN depends on these hyperparameters, but setting them properly remains a difficult and important problem. Although studies in the literature optimize each of these three parameter groups separately, methodologies for the simultaneous optimization of DPs and LPs in a nested framework are lacking. This study proposes a novel method called SwarmCNN, which combines particle swarm optimization and artificial bee colony algorithms to optimize both DPs and LPs. The effectiveness of SwarmCNN was evaluated across various datasets, including MNIST, MNIST-RD, MNIST-BN, MNIST-BI, MNIST-RD+BI, Convex, Rectangles, Fashion-MNIST, and CIFAR-10. The results demonstrate promising accuracy rates: 99.58%, 96.20%, 97.56%, 96.39%, 83.39%, 96.92%, 100%, 93.47%, and 84.77%, respectively. Comparative analysis against numerous competitors revealed SwarmCNN's superiority on five datasets and its second-place ranking on four datasets. These results establish SwarmCNN as a powerful and competitive solution for optimizing hyperparameters and conducting neural architecture search with high accuracy on various datasets.
Pages: 42
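The nested framework the abstract describes (an outer swarm searching design parameters, with an inner swarm tuning layer parameters for each candidate design) can be illustrated with a minimal sketch. This is not the paper's implementation: a toy surrogate function stands in for actual CNN training, and all function names, parameter ranges, and swarm settings below are illustrative assumptions, not values from SwarmCNN.

```python
import random

random.seed(0)

# Toy surrogate "validation error". In the paper's setting this would be the
# error of a CNN with `depth` layers (a design parameter, DP) and per-layer
# filter counts (layer parameters, LPs) trained on the target dataset.
# This stand-in simply prefers depth 3 and filter counts near 64.
def surrogate_error(depth, filters):
    return (depth - 3) ** 2 + sum((f - 64) ** 2 for f in filters) / 1e4

# Inner loop: a minimal ABC-style search over LPs for a fixed depth.
def abc_optimize_lps(depth, n_food=5, iters=20, lo=8, hi=128):
    foods = [[random.randint(lo, hi) for _ in range(depth)] for _ in range(n_food)]
    best = min(foods, key=lambda f: surrogate_error(depth, f))
    for _ in range(iters):
        for i, food in enumerate(foods):
            # Employed-bee step: perturb one dimension toward another food source.
            j = random.randrange(depth)
            partner = random.choice(foods)
            cand = food[:]
            step = random.randint(-1, 1) * abs(cand[j] - partner[j])
            cand[j] = min(hi, max(lo, cand[j] + step))
            if surrogate_error(depth, cand) < surrogate_error(depth, food):
                foods[i] = cand
        cur = min(foods, key=lambda f: surrogate_error(depth, f))
        if surrogate_error(depth, cur) < surrogate_error(depth, best):
            best = cur
    return best, surrogate_error(depth, best)

# Outer loop: a minimal PSO over the DP (network depth); each particle's
# fitness is the best error the inner ABC run achieves for that depth.
def pso_optimize_dp(n_particles=4, iters=10, d_lo=1, d_hi=6):
    pos = [random.randint(d_lo, d_hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_err = [abc_optimize_lps(d)[1] for d in pos]
    g = pbest_err.index(min(pbest_err))
    gbest, gbest_err = pbest[g], pbest_err[g]
    for _ in range(iters):
        for i in range(n_particles):
            # Standard PSO velocity update with cognitive and social terms.
            vel[i] = (0.5 * vel[i]
                      + random.random() * (pbest[i] - pos[i])
                      + random.random() * (gbest - pos[i]))
            pos[i] = min(d_hi, max(d_lo, round(pos[i] + vel[i])))
            err = abc_optimize_lps(pos[i])[1]
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i], err
    return gbest, gbest_err

best_depth, best_err = pso_optimize_dp()
print("best depth:", best_depth, "surrogate error:", best_err)
```

The key design point the sketch captures is the nesting: every outer PSO evaluation triggers a full inner ABC search, so a candidate architecture is judged by its best achievable layer configuration rather than by arbitrary layer settings.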