Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks

Cited by: 19
Authors
Chowdhury, Sayeed Shafayet [1 ]
Garg, Isha [1 ]
Roy, Kaushik [1 ]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47905 USA
Funding
US National Science Foundation
Keywords
SNN; temporal pruning; latency; spike rate; quantization; accuracy
DOI
10.1109/IJCNN52387.2021.9534111
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Spiking Neural Networks (SNNs) are a promising alternative to traditional deep learning methods because they perform event-driven information processing. A major drawback of SNNs, however, is high inference latency. Their efficiency can be improved through compression methods such as pruning and quantization. Notably, SNNs, unlike their non-spiking counterparts, have a temporal dimension, and compressing it reduces latency. In this paper, we propose spatial and temporal pruning of SNNs. First, structured spatial pruning is performed by determining the layer-wise significant dimensions using principal component analysis of the average accumulated membrane potential of the neurons. This step yields 10-14X model compression, enables inference at lower latency, and decreases the spike count per inference. To reduce latency further, temporal pruning is performed by gradually reducing the timesteps while training. The networks are trained using surrogate-gradient-based backpropagation, and the results are validated on CIFAR10 and CIFAR100 with VGG architectures. The spatio-temporally pruned SNNs achieve 89.04% and 66.4% accuracy on CIFAR10 and CIFAR100, respectively, while performing inference with 3-30X lower latency than state-of-the-art SNNs. Moreover, they require 8-14X less compute energy than their unpruned standard deep learning counterparts; the energy numbers are obtained by multiplying the number of operations by the energy per operation. These SNNs also exhibit 1-4% higher robustness to Gaussian-noise-corrupted inputs. Finally, we quantize the weights and find that performance remains reasonably stable down to 5-bit precision.
Pages: 9
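
The abstract's spatial-pruning step hinges on PCA of the average accumulated membrane potentials. Below is a minimal sketch of how the layer-wise significant dimensionality could be estimated; the matrix layout, variance threshold, and function name (significant_dimensions) are illustrative assumptions, not the authors' code.

```python
# Sketch: estimate a layer's "significant" dimensionality via PCA of the
# average accumulated membrane potentials (assumed collected over a
# calibration set; shapes and threshold are illustrative).
import numpy as np

def significant_dimensions(avg_potentials: np.ndarray,
                           var_threshold: float = 0.99) -> int:
    """avg_potentials: (num_samples, num_neurons) matrix of membrane
    potentials accumulated over timesteps and averaged per input sample.
    Returns the number of principal components explaining `var_threshold`
    of the total variance."""
    centered = avg_potentials - avg_potentials.mean(axis=0, keepdims=True)
    # Singular values of the centered data matrix give the PCA spectrum.
    s = np.linalg.svd(centered, compute_uv=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(explained, var_threshold) + 1)
```

A layer whose estimated significant dimensionality is far below its width is a candidate for structured pruning, e.g., shrinking its channel count toward that dimensionality.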
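Temporal pruning is described only as gradually reducing the timesteps while training. One hypothetical way to realize that schedule is sketched below; the stage sizes and the supplied training callback (train_one_epoch) are placeholders, not taken from the paper.

```python
# Hypothetical temporal-pruning schedule: fine-tune at the current timestep
# count, cut it, and repeat until the target latency is reached.
def temporal_pruning_schedule(train_one_epoch,
                              t_start: int = 25,
                              t_min: int = 5,
                              t_step: int = 5,
                              epochs_per_stage: int = 10) -> None:
    t = t_start
    while True:
        for _ in range(epochs_per_stage):
            train_one_epoch(timesteps=t)  # unroll the SNN for t timesteps
        if t == t_min:
            return
        t = max(t_min, t - t_step)        # prune the temporal dimension
```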
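Training uses surrogate-gradient-based backpropagation. A generic PyTorch sketch of a spiking nonlinearity with a triangular surrogate gradient follows; the exact surrogate shape used in the paper is an assumption here.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, triangular surrogate gradient
    in the backward pass (a common generic choice, not necessarily the
    paper's exact surrogate)."""
    @staticmethod
    def forward(ctx, mem, threshold=1.0):
        ctx.save_for_backward(mem)
        ctx.threshold = threshold
        return (mem >= threshold).float()  # spike when potential crosses threshold

    @staticmethod
    def backward(ctx, grad_output):
        (mem,) = ctx.saved_tensors
        # Nonzero gradient only near the threshold, which lets errors
        # backpropagate through the otherwise non-differentiable spike.
        grad = torch.clamp(1.0 - torch.abs(mem - ctx.threshold), min=0.0)
        return grad_output * grad, None    # no gradient for `threshold`
```

Usage: `spikes = SurrogateSpike.apply(membrane_potential)` inside the unrolled forward pass.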
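The abstract states that energy is estimated as (number of operations) x (energy per operation). A sketch of that bookkeeping is below; the per-operation energies are illustrative 45 nm CMOS figures often used in SNN papers (Horowitz, ISSCC 2014), not values quoted from this record.

```python
# Back-of-envelope energy model: total energy = #ops * energy-per-op.
E_MAC = 4.6e-12  # J per 32-bit float multiply-accumulate (ANN op), assumed
E_AC = 0.9e-12   # J per 32-bit float accumulate (SNN spike-driven op), assumed

def ann_energy(num_synaptic_ops: float) -> float:
    return num_synaptic_ops * E_MAC

def snn_energy(num_synaptic_ops: float,
               avg_spike_rate: float,
               timesteps: int) -> float:
    # Only spiking synapses trigger accumulates, so SNN energy scales with
    # the spike rate and the number of timesteps; pruning both the spatial
    # and temporal dimensions therefore cuts this product directly.
    return num_synaptic_ops * avg_spike_rate * timesteps * E_AC
```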
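For the weight-quantization experiment, a generic uniform symmetric quantizer is sketched below; the paper's exact quantization scheme is not given in this record, so treat the scheme and names as assumptions.

```python
import torch

def quantize_weights(w: torch.Tensor, bits: int = 5) -> torch.Tensor:
    """Uniform symmetric quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 15 levels per side at 5 bits
    scale = w.abs().max().clamp(min=1e-12) / qmax    # map max |w| to the top level
    return torch.round(w / scale).clamp(-qmax, qmax) * scale
```

At 5 bits this keeps 31 distinct weight levels, consistent with the abstract's observation that accuracy remains reasonably stable down to that precision.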