Bare-Bones Particle Swarm Optimization-Based Quantization for Fast and Energy-Efficient Convolutional Neural Networks

Cited by: 3
Authors
Tmamna, Jihene [1 ]
Ben Ayed, Emna [1 ]
Fourati, Rahma [1 ,2 ,5 ]
Hussain, Amir [3 ]
Ben Ayed, Mounir [1 ,4 ]
Affiliations
[1] Natl Engn Sch Sfax ENIS, Univ Sfax, Res Grp Intelligent Machines, Sfax, Tunisia
[2] Univ Jendouba, Fac Law Econ & Management Sci Jendouba FSJEGJ, Jendouba, Tunisia
[3] Edinburgh Napier Univ, Sch Comp, Edinburgh, Scotland
[4] Univ Sfax, Fac Sci Sfax, Comp Sci & Commun Dept, Sfax, Tunisia
[5] Natl Engn Sch Sfax ENIS, Univ Sfax, Res Grp Intelligent Machines, Sfax 3038, Tunisia
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Barebone PSO; energy-efficient model inference; mixed-precision quantization; model compression
DOI
10.1111/exsy.13522
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural network quantization is a critical method for reducing memory usage and computational complexity in deep learning models, making them more suitable for deployment on resource-constrained devices. In this article, we propose a method called BBPSO-Quantizer, which utilizes an enhanced Bare-Bones Particle Swarm Optimization algorithm, to address the challenging problem of mixed precision quantization of convolutional neural networks (CNNs). Our proposed algorithm leverages a new population initialization, a robust screening process, and a local search strategy to improve the search performance and guide the population towards a feasible region. Additionally, Deb's constraint handling method is incorporated to ensure that the optimized solutions satisfy the functional constraints. The effectiveness of our BBPSO-Quantizer is evaluated on various state-of-the-art CNN architectures, including VGG, DenseNet, ResNet, and MobileNetV2, using CIFAR-10, CIFAR-100, and Tiny ImageNet datasets. Comparative results demonstrate that our method delivers an excellent tradeoff between accuracy and computational efficiency.
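The abstract names two generic ingredients: the bare-bones PSO position update (each coordinate sampled from a Gaussian centred between the personal and global bests) and Deb's feasibility rules for constraint handling. The following is a minimal illustrative sketch of those two techniques on a toy constrained minimization problem; the function names, toy objective, and constraint are invented for illustration and do not reproduce the paper's BBPSO-Quantizer, its initialization, screening, or local-search components.

```python
import random

# Toy problem (not from the paper): minimize the sum of squares
# subject to sum(x) >= 1, i.e. the constrained optimum lies on the boundary.
def objective(x):
    return sum(v * v for v in x)

def violation(x):
    # Constraint sum(x) >= 1 expressed as a non-negative violation measure.
    return max(0.0, 1.0 - sum(x))

def deb_better(a, b):
    """Deb's feasibility rules: a feasible solution beats an infeasible one;
    between feasible solutions the lower objective wins; between infeasible
    solutions the lower constraint violation wins."""
    va, vb = violation(a), violation(b)
    if va == 0.0 and vb == 0.0:
        return objective(a) < objective(b)
    if va == 0.0 or vb == 0.0:
        return va == 0.0
    return va < vb

def bbpso(dim=4, swarm=20, iters=200, seed=0):
    rng = random.Random(seed)
    xs = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(swarm)]
    pbest = [list(x) for x in xs]
    gbest = list(min(pbest, key=lambda p: (violation(p), objective(p))))
    for _ in range(iters):
        for i in range(swarm):
            # Bare-bones update: no velocities; each coordinate is drawn from
            # N((pbest + gbest)/2, |pbest - gbest|).
            xs[i] = [rng.gauss((p + g) / 2.0, abs(p - g))
                     for p, g in zip(pbest[i], gbest)]
            if deb_better(xs[i], pbest[i]):
                pbest[i] = list(xs[i])
                if deb_better(pbest[i], gbest):
                    gbest = list(pbest[i])
    return gbest
```

In a quantization setting, the analogous search space would be discrete bit-width assignments per layer with accuracy and resource constraints; the sketch above only shows how the parameter-free bare-bones update and Deb's comparison rule fit together.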
Pages: 21
Related Papers
58 records
  • [1] Bablani D., 2023, ARXIV
  • [2] Deep Learning with Low Precision by Half-wave Gaussian Quantization
    Cai, Zhaowei
    He, Xiaodong
    Sun, Jian
    Vasconcelos, Nuno
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5406 - 5414
  • [3] Hardware-Friendly Logarithmic Quantization with Mixed-Precision for MobileNetV2
    Choi, Dahun
    Kim, Hyun
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2022): INTELLIGENT TECHNOLOGY IN THE POST-PANDEMIC ERA, 2022, : 348 - 351
  • [4] Choi J., 2018, Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies
  • [5] Courbariaux M, 2015, ADV NEUR IN, V28
  • [6] An efficient constraint handling method for genetic algorithms
    Deb, K
    [J]. COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2000, 186 (2-4) : 311 - 338
  • [7] Devlin J., 2018, BERT PRETRAINING DEE
  • [8] HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
    Dong, Zhen
    Yao, Zhewei
    Gholami, Amir
    Mahoney, Michael W.
    Keutzer, Kurt
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 293 - 302
  • [9] Dong Zhen, 2020, ADV NEUR IN, V33
  • [10] Esser Steven K, 2019, LEARNED STEP SIZE QU