A Spatial-Channel-Temporal-Fused Attention for Spiking Neural Networks

Cited by: 9
Authors
Cai, Wuque [1 ]
Sun, Hongze [1 ]
Liu, Rui [1 ]
Cui, Yan [2 ]
Wang, Jun [1 ]
Xia, Yang [1 ]
Yao, Dezhong [1 ,3 ,4 ]
Guo, Daqing [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Clin Hosp Chengdu Brain Sci Inst, Sch Life Sci & Technol, Minist Educ MOE Key Lab NeuroInformat, Chengdu 611731, Peoples R China
[2] Univ Elect Sci & Technol China, Sichuan Prov Peoples Hosp, Dept Neurosurg, Chengdu 610072, Peoples R China
[3] Chinese Acad Med Sci, Res Unit NeuroInformat 2019RU035, Chengdu 611731, Peoples R China
[4] Zhengzhou Univ, Sch Elect Engn, Zhengzhou 450001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Neurons; Visualization; Biological system modeling; Training; Computational modeling; Membrane potentials; Biological neural networks; Event streams; predictive attentional remapping; spatial-channel-temporal-fused attention (SCTFA); spiking neural networks (SNNs); visual attention
DOI
10.1109/TNNLS.2023.3278265
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Spiking neural networks (SNNs) mimic the computational strategies of the brain and exhibit substantial capabilities for spatiotemporal information processing. Visual attention, an essential factor in human perception, refers to the dynamic process by which biological vision systems select salient regions. Although visual attention mechanisms have achieved great success in computer vision applications, they are rarely introduced into SNNs. In the present study, inspired by experimental observations on predictive attentional remapping, we propose a new spatial-channel-temporal-fused attention (SCTFA) module that guides SNNs to efficiently capture underlying target regions by utilizing accumulated historical spatial-channel information. Through a systematic evaluation on three event stream datasets (DVS Gesture, SL-Animals-DVS, and MNIST-DVS), we demonstrate that the SNN with the SCTFA module (SCTFA-SNN) not only significantly outperforms the baseline SNN (BL-SNN) and two other SNN models with degenerated attention modules, but also achieves accuracy competitive with existing state-of-the-art (SOTA) methods. Additionally, our detailed analysis shows that the proposed SCTFA-SNN model is strongly robust to noise and remarkably stable when faced with incomplete data, while maintaining acceptable complexity and efficiency. Overall, these findings indicate that incorporating appropriate cognitive mechanisms of the brain may provide a promising approach for elevating the capabilities of SNNs.
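The core idea in the abstract can be illustrated with a deliberately simplified, parameter-free sketch: at each timestep, channel and spatial attention weights are derived from the running (historical) average of the spike features and used to reweight the current features. This is only a toy illustration of fusing accumulated spatial-channel information over time, not the authors' SCTFA implementation, which learns its attention maps during training; all function and variable names here are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sctfa_like_attention(x):
    """Toy spatial-channel attention fused over time.

    x: nonnegative spike features of shape (T, C, H, W). At each
    timestep t, channel weights and a spatial map are computed from
    the running average of the features up to t, loosely mirroring
    the idea of reusing accumulated historical spatial-channel
    information. No learned parameters are involved.
    """
    T, C, H, W = x.shape
    out = np.empty_like(x, dtype=float)
    hist = np.zeros((C, H, W))
    for t in range(T):
        hist = (hist * t + x[t]) / (t + 1)       # accumulated history
        ch = sigmoid(hist.mean(axis=(1, 2)))     # channel weights, shape (C,)
        sp = sigmoid(hist.mean(axis=0))          # spatial map, shape (H, W)
        out[t] = x[t] * ch[:, None, None] * sp   # fused reweighting
    return out
```

Because the gating factors lie in (0, 1), the output is an element-wise attenuated copy of the input in which channels and locations with stronger historical activity are suppressed less; the learned SCTFA module instead produces its maps from trainable convolutions.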
Pages: 14315-14329
Page count: 15