Bare-Bones Particle Swarm Optimization-Based Quantization for Fast and Energy-Efficient Convolutional Neural Networks

Cited by: 3
Authors
Tmamna, Jihene [1 ]
Ben Ayed, Emna [1 ]
Fourati, Rahma [1 ,2 ,5 ]
Hussain, Amir [3 ]
Ben Ayed, Mounir [1 ,4 ]
Affiliations
[1] Natl Engn Sch Sfax ENIS, Univ Sfax, Res Grp Intelligent Machines, Sfax, Tunisia
[2] Univ Jendouba, Fac Law Econ & Management Sci Jendouba FSJEGJ, Jendouba, Tunisia
[3] Edinburgh Napier Univ, Sch Comp, Edinburgh, Scotland
[4] Univ Sfax, Fac Sci Sfax, Comp Sci & Commun Dept, Sfax, Tunisia
[5] Natl Engn Sch Sfax ENIS, Univ Sfax, Res Grp Intelligent Machines, Sfax 3038, Tunisia
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Barebone PSO; energy-efficient model inference; mixed-precision quantization; model compression
DOI
10.1111/exsy.13522
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural network quantization is a critical method for reducing memory usage and computational complexity in deep learning models, making them more suitable for deployment on resource-constrained devices. In this article, we propose a method called BBPSO-Quantizer, which utilizes an enhanced Bare-Bones Particle Swarm Optimization algorithm, to address the challenging problem of mixed precision quantization of convolutional neural networks (CNNs). Our proposed algorithm leverages a new population initialization, a robust screening process, and a local search strategy to improve the search performance and guide the population towards a feasible region. Additionally, Deb's constraint handling method is incorporated to ensure that the optimized solutions satisfy the functional constraints. The effectiveness of our BBPSO-Quantizer is evaluated on various state-of-the-art CNN architectures, including VGG, DenseNet, ResNet, and MobileNetV2, using CIFAR-10, CIFAR-100, and Tiny ImageNet datasets. Comparative results demonstrate that our method delivers an excellent tradeoff between accuracy and computational efficiency.
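The abstract names two generic ingredients: the bare-bones PSO position update (each coordinate sampled from a Gaussian centred between the personal and global bests) and Deb's feasibility rules for constraint handling. The following is a minimal illustrative sketch of those two techniques on a toy constrained minimization problem; the function names, toy objective, and constraint are invented for illustration and do not reproduce the paper's BBPSO-Quantizer, its initialization, screening, or local-search components.

```python
import random

# Toy problem (not from the paper): minimize the sum of squares
# subject to sum(x) >= 1, i.e. the constrained optimum lies on the boundary.
def objective(x):
    return sum(v * v for v in x)

def violation(x):
    # Constraint sum(x) >= 1 expressed as a non-negative violation measure.
    return max(0.0, 1.0 - sum(x))

def deb_better(a, b):
    """Deb's feasibility rules: a feasible solution beats an infeasible one;
    between feasible solutions the lower objective wins; between infeasible
    solutions the lower constraint violation wins."""
    va, vb = violation(a), violation(b)
    if va == 0.0 and vb == 0.0:
        return objective(a) < objective(b)
    if va == 0.0 or vb == 0.0:
        return va == 0.0
    return va < vb

def bbpso(dim=4, swarm=20, iters=200, seed=0):
    rng = random.Random(seed)
    xs = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(swarm)]
    pbest = [list(x) for x in xs]
    gbest = list(min(pbest, key=lambda p: (violation(p), objective(p))))
    for _ in range(iters):
        for i in range(swarm):
            # Bare-bones update: no velocities; each coordinate is drawn from
            # N((pbest + gbest)/2, |pbest - gbest|).
            xs[i] = [rng.gauss((p + g) / 2.0, abs(p - g))
                     for p, g in zip(pbest[i], gbest)]
            if deb_better(xs[i], pbest[i]):
                pbest[i] = list(xs[i])
                if deb_better(pbest[i], gbest):
                    gbest = list(pbest[i])
    return gbest
```

In a quantization setting, the analogous search space would be discrete bit-width assignments per layer with accuracy and resource constraints; the sketch above only shows how the parameter-free bare-bones update and Deb's comparison rule fit together.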
Pages: 21
Related Papers
58 records
  • [1] Bablani D., 2023, ARXIV
  • [2] Deep Learning with Low Precision by Half-wave Gaussian Quantization
    Cai, Zhaowei
    He, Xiaodong
    Sun, Jian
    Vasconcelos, Nuno
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5406 - 5414
  • [3] Hardware-Friendly Logarithmic Quantization with Mixed-Precision for MobileNetV2
    Choi, Dahun
    Kim, Hyun
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2022): INTELLIGENT TECHNOLOGY IN THE POST-PANDEMIC ERA, 2022, : 348 - 351
  • [4] Choi J., 2018, Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies
  • [5] Courbariaux M, 2015, ADV NEUR IN, V28
  • [6] An efficient constraint handling method for genetic algorithms
    Deb, K
    [J]. COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2000, 186 (2-4) : 311 - 338
  • [7] Devlin J., 2018, BERT PRETRAINING DEE
  • [8] HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
    Dong, Zhen
    Yao, Zhewei
    Gholami, Amir
    Mahoney, Michael W.
    Keutzer, Kurt
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 293 - 302
  • [9] Dong Zhen, 2020, ADV NEUR IN, V33
  • [10] Esser Steven K, 2019, LEARNED STEP SIZE QU