Unlocking the Potential of Spiking Neural Networks: Understanding the What, Why, and Where

Cited by: 6
Authors
Wickramasinghe, Buddhi [1 ]
Chowdhury, Sayeed Shafayet [1 ]
Kosta, Adarsh Kumar [1 ]
Ponghiran, Wachirawit [1 ]
Roy, Kaushik [1 ]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
Funding
U.S. National Science Foundation
Keywords
Neurons; Training; Task analysis; Encoding; Membrane potentials; Computational modeling; Recurrent neural networks; Automatic speech recognition; neuromorphic computing; optical flow estimation; spiking neural networks (SNNs); MEMORY;
DOI
10.1109/TCDS.2023.3329747
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Spiking neural networks (SNNs) are a promising avenue for machine learning with superior energy efficiency compared to traditional artificial neural networks (ANNs). Recent advances in training and input encoding have put SNNs on par with state-of-the-art ANNs in image classification. However, such tasks do not fully utilize the internal dynamics of SNNs. Notably, a spiking neuron's membrane potential acts as an internal memory, integrating incoming inputs sequentially. This recurrent dynamic enables the networks to learn temporal correlations, making SNNs suitable for sequential learning. Such problems can also be tackled using ANNs. However, to capture the temporal dependencies, either the inputs have to be lumped over time (e.g., Transformers), or explicit recurrence needs to be introduced [e.g., recurrent neural networks (RNNs) and long short-term memory (LSTM) networks], which incurs considerable complexity. To that end, we explore the capabilities of SNNs in providing lightweight solutions to four sequential tasks involving text, speech, and vision. Our results demonstrate that SNNs, by leveraging their intrinsic memory, can be an efficient alternative to RNNs and LSTMs for sequence processing, especially for certain edge applications. Furthermore, SNNs can be combined with ANNs (hybrid networks) synergistically to obtain the best of both worlds in terms of accuracy and efficiency.
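The abstract's central point, that a spiking neuron's membrane potential acts as an implicit memory that integrates inputs over time without explicit recurrent connections, can be illustrated with a minimal sketch. The leaky integrate-and-fire (LIF) neuron below is an illustrative assumption, not the authors' implementation; the parameter names (leak, threshold) and the hard-reset rule are choices made here for clarity.

```python
# Minimal sketch of a leaky integrate-and-fire (LIF) neuron, showing how
# the membrane potential carries state across time steps (the "intrinsic
# memory" discussed in the abstract). Assumed, illustrative parameters only.

import numpy as np

def lif_neuron(inputs, leak=0.9, threshold=1.0):
    """Run a single LIF neuron over a 1-D sequence of input currents.

    The membrane potential v persists between time steps, so temporal
    information is retained without explicit recurrent weights.
    """
    v = 0.0                      # membrane potential (internal memory)
    spikes = []
    for x in inputs:
        v = leak * v + x         # leaky integration of the new input
        if v >= threshold:       # fire when the threshold is crossed
            spikes.append(1)
            v = 0.0              # hard reset after a spike
        else:
            spikes.append(0)
    return np.array(spikes)

if __name__ == "__main__":
    # Example: a short random input sequence produces a sparse spike train.
    rng = np.random.default_rng(0)
    currents = rng.uniform(0.0, 0.5, size=20)
    print(lif_neuron(currents))
```

Because each output depends on the accumulated membrane potential rather than only on the current input, a feedforward stack of such neurons already behaves like a lightweight recurrent model, which is the property the article exploits for sequential tasks.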
Pages: 1648-1663
Page count: 16