Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

Cited by: 34
Authors
Deng, Lei [1]
Wu, Yujie [1]
Hu, Yifan [1]
Liang, Ling [2]
Li, Guoqi [1]
Hu, Xing [3]
Ding, Yufei [4]
Li, Peng [2]
Xie, Yuan [2]
Affiliations
[1] Tsinghua Univ, Dept Precis Instrument, Ctr Brain Inspired Comp Res, Beijing 100084, Peoples R China
[2] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
[3] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China
[4] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Funding
National Natural Science Foundation of China
Keywords
Neurons; Computational modeling; Quantization (signal); Optimization; Encoding; Task analysis; Synapses; Activity regularization; alternating direction method of multipliers (ADMM); connection pruning; spiking neural network (SNN) compression; weight quantization; spiking neural networks; classification
DOI
10.1109/TNNLS.2021.3109064
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
As is well known, the large memory and compute costs of both artificial neural networks (ANNs) and spiking neural networks (SNNs) greatly hinder their efficient deployment on edge devices. Model compression has been proposed as a promising technique to improve running efficiency via parameter and operation reduction, yet it has mainly been practiced on ANNs rather than SNNs. An open question is how much an SNN model can be compressed without compromising its functionality, which raises two challenges: 1) the accuracy of SNNs is usually sensitive to model compression, which demands an accurate compression methodology, and 2) the computation of SNNs is event-driven rather than static, which adds an extra compression dimension on dynamic spikes. To this end, we realize comprehensive SNN compression in three steps. First, we formulate connection pruning and weight quantization as a constrained optimization problem. Second, we combine spatiotemporal backpropagation (STBP) and the alternating direction method of multipliers (ADMM) to solve the problem with minimal accuracy loss. Third, we further propose activity regularization to reduce spike events and thus the number of active operations. These methods can be applied individually for moderate compression or jointly for aggressive compression. We define several quantitative metrics to evaluate the compression performance of SNNs. Our methodology is validated on pattern recognition tasks over the MNIST, N-MNIST, CIFAR10, and CIFAR100 datasets, with extensive comparisons, analyses, and insights. To the best of our knowledge, this is the first work to study SNN compression comprehensively by exploiting all compressible components, and it achieves better compression results.
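To make the three steps concrete, below is a minimal sketch (not the authors' released code) of the two training-time ingredients the abstract describes: the ADMM treatment of pruning/quantization as projections in a constrained problem, and the activity regularizer that penalizes spike counts. It assumes PyTorch; all function names and the hyperparameters `sparsity`, `step`, `rho`, and `lam` are illustrative assumptions, and the STBP training loop itself is omitted.

```python
# Illustrative sketch of ADMM-based compression and activity regularization
# for an SNN trained with surrogate-gradient backpropagation (e.g., STBP).
# Function names and hyperparameters are assumptions, not the paper's code.
import torch

def project_to_sparse(w: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Euclidean projection onto the set of tensors with at most a
    (1 - sparsity) fraction of nonzeros: keep the largest-magnitude weights."""
    k = int(w.numel() * (1.0 - sparsity))            # number of weights to keep
    if k >= w.numel():
        return w.clone()
    flat = w.flatten().abs()
    threshold = flat.kthvalue(w.numel() - k).values  # (n-k)-th smallest magnitude
    return torch.where(w.abs() > threshold, w, torch.zeros_like(w))

def project_to_quantized(w: torch.Tensor, step: float) -> torch.Tensor:
    """Euclidean projection onto a uniform quantization grid of spacing `step`."""
    return torch.round(w / step) * step

def admm_penalty(w, z, u, rho):
    """Augmented-Lagrangian term added to the task loss during the w-update,
    pulling the trainable weights w toward the feasible copy z."""
    return 0.5 * rho * torch.sum((w - z + u) ** 2)

def admm_dual_step(w, z, u, sparsity):
    """z- and u-updates of ADMM for `min_w L(w) s.t. w in S`, with the
    constraint split off onto a consensus variable z (w = z)."""
    z = project_to_sparse(w.detach() + u, sparsity)  # z-update: projection onto S
    u = u + w.detach() - z                           # dual ascent on the residual
    return z, u

def activity_regularizer(spikes: torch.Tensor, lam: float) -> torch.Tensor:
    """Penalize the mean firing rate over (time, batch, neurons), trading a
    little task loss for fewer spike events, i.e., fewer active operations."""
    return lam * spikes.mean()
```

In such a scheme, each ADMM round would minimize task loss plus `admm_penalty(...)` plus `activity_regularizer(...)` by STBP for a few epochs and then call `admm_dual_step`; weight quantization follows the same pattern with `project_to_quantized` in place of the sparse projection.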
Pages: 2791-2805 (15 pages)