3D Convolutional Spiking Neural Network for Human Action Recognition Using Modulating STDP With Global Error Feedback

Cited by: 1
Authors
Nawarathne, Thoshara [1 ]
Leung, Henry [1 ]
Affiliations
[1] Univ Calgary, Dept Elect & Comp Engn, Calgary, AB, Canada
Source
18TH ANNUAL IEEE INTERNATIONAL SYSTEMS CONFERENCE, SYSCON 2024 | 2024
Keywords
Asynchronous video; 3D convolutional spiking neural networks; action recognition; UCF 101 dataset
DOI
10.1109/SysCon61195.2024.10553446
CLC number
TP301 [Theory and Methods]
Discipline classification code
081202
Abstract
Video action recognition using 3D Convolutional Neural Networks (CNNs) has become an increasingly popular strategy in recent years with the evolution of machine learning and computer vision. However, the high memory and computation requirements of these networks motivate the use of low-power, memory-efficient neural networks to perform video action recognition tasks efficiently. The spike-based information processing and computation of bio-inspired spiking convolutional neural networks play an essential role in energy-efficient, memory-saving computation for video classification and action recognition tasks, enabling on-chip real-time processing. This paper proposes a novel 3D Convolutional Spiking Neural Network (CSNN) architecture with modulating-STDP supervised learning via global error feedback for human action recognition in video data. The proposed model comprises two 3D convolutional layers, each followed by a spiking neuron layer modeled with Leaky Integrate-and-Fire (LIF) neurons, for feature extraction from video data. Using the modulating STDP learning rule with global error feedback, the model successfully recognizes human actions from video data while allowing online parallel computation. The proposed network was evaluated on two datasets: a 3D image dataset (synthesized 3D MNIST) and a video dataset (the UCF 101 human action recognition dataset), achieving 71.6% and 63.7% recognition accuracy, respectively.
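The spiking layers described in the abstract are built from Leaky Integrate-and-Fire (LIF) neurons trained with an error-modulated STDP rule. As a rough illustrative sketch only (this is not the authors' implementation; the decay, threshold, learning-rate values, and the exact form of the modulated update are assumptions), the discrete-time LIF dynamics and a global-error-modulated Hebbian weight change can be written as:

```python
# Minimal sketch of discrete-time LIF dynamics and an error-modulated
# STDP-style weight update. All parameter values are illustrative
# assumptions, not taken from the paper.

def lif_step(v, input_current, decay=0.9, threshold=1.0):
    """One LIF update: leak, integrate, fire, hard reset."""
    v = decay * v + input_current        # leaky integration of input
    spike = 1 if v >= threshold else 0   # fire when threshold is crossed
    if spike:
        v = 0.0                          # reset membrane potential
    return v, spike

def run_lif(currents, decay=0.9, threshold=1.0):
    """Run one LIF neuron over an input-current trace; return the spike train."""
    v, spikes = 0.0, []
    for i in currents:
        v, s = lif_step(v, i, decay, threshold)
        spikes.append(s)
    return spikes

def modulated_stdp_update(w, pre_trace, post_spike, global_error, lr=0.01):
    """Hebbian (pre-trace x post-spike) term scaled by a global error signal.

    The global error acts as a third factor that gates and signs the
    local STDP-like update (a common three-factor-learning sketch).
    """
    return w + lr * global_error * pre_trace * post_spike
```

In this sketch, supervision enters only through `global_error`: the local pre/post correlation term is unsupervised, and the broadcast error decides whether it potentiates or depresses the synapse.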
Pages: 6