Spikeformer: Training high-performance spiking neural network with transformer

Cited by: 4
Authors
Li, Yudong [1 ]
Lei, Yunlin [1 ]
Yang, Xu [1 ]
Affiliations
[1] Beijing Inst Technol, AETAS Lab, Beijing 100081, Peoples R China
Keywords
Spiking neural networks; Transformer; Spatio-temporal attention; Deep neural architectures; Artificial intelligence; Neurons
DOI
10.1016/j.neucom.2024.127279
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Although spiking neural networks (SNNs) have made great progress in both performance and efficiency over the last few years, their unique working pattern makes it hard to train high-performance, low-latency SNNs, and their development still lags behind that of traditional artificial neural networks (ANNs). To close this gap, many remarkable works have been proposed, but they are mainly based on the same network structure (i.e., CNN) and perform worse than their ANN counterparts, which limits the applications of SNNs. To this end, we propose a Transformer-based SNN, termed "Spikeformer", which outperforms its ANN counterpart on both static and neuromorphic datasets. First, to address the "data hungry" problem and the unstable training exhibited by the vanilla model, we design the Convolutional Tokenizer (CT) module, which stabilizes training and improves the accuracy of the original model on DVS-Gesture by more than 16%. In addition, we integrate Spatio-Temporal Attention (STA) into Spikeformer to better combine the attention mechanism of the Transformer with the spatio-temporal information inherent to SNNs. With our proposed method, we achieve 98.96%/75.89% top-1 accuracy on the DVS-Gesture/ImageNet datasets with 16/4 simulation time steps. On DVS-CIFAR10, we further conduct an energy consumption analysis and obtain 81.4%/80.3% top-1 accuracy with 4/1 time step(s), achieving 1.7x/6.4x energy efficiency over the ANN counterpart. Moreover, Spikeformer outperforms its ANN counterpart by 3.13% and 0.12% on DVS-Gesture and ImageNet respectively, indicating that Spikeformer may be a more suitable architecture than CNN for training SNNs. We believe this work will help the development of SNNs keep pace with ANNs. Code will be publicly available.
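The "simulation time steps" mentioned in the abstract refer to the discrete-time dynamics of spiking neurons, which emit binary spikes instead of continuous activations. A minimal sketch of a leaky integrate-and-fire (LIF) neuron, the standard spiking unit in SNNs of this kind; the constants and the exact update rule here are illustrative assumptions, not values taken from the paper:

```python
def lif_neuron(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron over T discrete simulation time steps.

    inputs: input current at each time step (list of floats).
    Returns the binary spike train (list of 0/1), one entry per step.
    """
    v = v_reset          # membrane potential
    spikes = []
    for x in inputs:
        # leaky integration: potential decays toward v_reset with time
        # constant tau, then accumulates the input current
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after firing
        else:
            spikes.append(0)
    return spikes
```

A strong input drives the potential over threshold and produces a spike, while weak input merely charges the membrane; this thresholded, event-driven behavior is what underlies the energy-efficiency comparisons against ANNs reported in the abstract.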
Pages: 11