Comparing SNNs and RNNs on neuromorphic vision datasets: Similarities and differences

Cited by: 112
Authors
He, Weihua [1, 2]
Wu, YuJie [1, 3, 4]
Deng, Lei [2]
Li, Guoqi [1, 3, 4]
Wang, Haoyu [1]
Tian, Yang [5]
Ding, Wei [1]
Wang, Wenhui [1]
Xie, Yuan [2]
Affiliations
[1] Tsinghua Univ, Dept Precis Instrument, Beijing 100084, Peoples R China
[2] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
[3] Tsinghua Univ, Ctr Brain Inspired Comp Res, Beijing 100084, Peoples R China
[4] Tsinghua Univ, Beijing Innovat Ctr Future Chip, Beijing 100084, Peoples R China
[5] Tsinghua Univ, THBI, Lab Cognit Neurosci, Beijing 100084, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Spiking neural networks; Recurrent neural networks; Long short-term memory; Neuromorphic dataset; Spatiotemporal dynamics; SPIKING; MODEL; NETWORKS; EVENTS; SENSOR;
DOI
10.1016/j.neunet.2020.08.001
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Neuromorphic data, which record frameless spike events, have attracted considerable attention for their spatiotemporal information components and their event-driven processing fashion. Spiking neural networks (SNNs) represent a family of event-driven models with spatiotemporal dynamics for neuromorphic computing, and they are widely benchmarked on neuromorphic data. Interestingly, researchers in the machine learning community can argue that recurrent (artificial) neural networks (RNNs) also have the capability to extract spatiotemporal features although they are not event-driven. Thus, the question of "what will happen if we benchmark these two kinds of models together on neuromorphic data" arises but remains unanswered. In this work, we conduct a systematic study comparing SNNs and RNNs on neuromorphic data, taking vision datasets as a case study. First, we identify the similarities and differences between SNNs and RNNs (including vanilla RNNs and LSTM) from the modeling and learning perspectives. To improve comparability and fairness, we unify the supervised learning algorithm based on backpropagation through time (BPTT), the loss function exploiting the outputs at all timesteps, the network structure with stacked fully-connected or convolutional layers, and the hyper-parameters during training. In particular, we modify the mainstream loss function used in RNNs, inspired by the rate coding scheme, so that it approaches that of SNNs. Furthermore, we tune the temporal resolution of the datasets to test model robustness and generalization. Finally, a series of contrast experiments are conducted on two types of neuromorphic datasets: DVS-converted (N-MNIST) and DVS-captured (DVS Gesture). Extensive insights regarding recognition accuracy, feature extraction, temporal resolution and contrast, learning generalization, computational complexity and parameter volume are provided, which are beneficial for model selection on different workloads and even for the invention of novel neural models in the future. (c) 2020 Elsevier Ltd. All rights reserved.
Pages: 108-120
Page count: 13