7 μJ/inference end-to-end gesture recognition from dynamic vision sensor data using ternarized hybrid convolutional neural networks

Cited by: 2
Authors
Rutishauser, Georg [1]
Scherer, Moritz [1]
Fischer, Tim [1]
Benini, Luca [1, 2]
Affiliations
[1] Swiss Fed Inst Technol, Gloriastr 35, CH-8092 Zurich, Switzerland
[2] Univ Bologna, Via Zamboni 33, I-40126 Bologna, Italy
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2023 / Vol. 149
Funding
EU Horizon 2020;
Keywords
Dynamic vision sensors; Gesture recognition; Ternary neural networks; Edge computing; Low-power systems; EFFICIENT; ACCELERATOR;
DOI
10.1016/j.future.2023.07.017
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Dynamic vision sensor (DVS) cameras enable energy-activity proportional visual sensing by only propagating events produced by changes in the observed scene. Furthermore, by generating these events asynchronously, they offer μs-scale latency while eliminating the redundant data transmission inherent to classical, frame-based cameras. However, the potential of DVS to improve the energy efficiency of IoT sensor nodes can only be fully realized with efficient and flexible systems that tightly integrate sensing, processing, and actuation capabilities. In this paper, we propose a complete end-to-end pipeline for DVS event data classification implemented on the Kraken parallel ultra-low power (PULP) system-on-chip and apply it to gesture recognition. A dedicated on-chip peripheral interface for DVS cameras aggregates the received events into ternary event frames. We process these video frames with a fully ternarized two-stage temporal convolutional network (TCN). The neural network can be executed either on Kraken's PULP cluster of general-purpose RISC-V cores or on CUTIE, the on-chip ternary neural network accelerator. We perform extensive ablations on network structure, training, and data generation parameters. We achieve a validation accuracy of 97.7% on the DVS128 11-class gesture dataset, a new record for embedded implementations. With in-silicon power and energy measurements, we demonstrate a classification energy of 7 μJ at a latency of 0.9 ms when running the TCN on CUTIE, a reduction of inference energy by 67× when compared to the state of the art in embedded gesture recognition. The processing system consumes as little as 4.7 mW in continuous inference, enabling always-on gesture recognition and closing the gap between the efficiency potential of DVS cameras and application scenarios. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
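The abstract states that received events are aggregated into ternary event frames but does not give the aggregation rule. The following is a minimal sketch of one plausible scheme, not the paper's implementation: it assumes each event is an `(x, y, polarity)` tuple with polarity +1 or -1, a 128 x 128 sensor, and a per-pixel sign of the net polarity within one time window. The function name `events_to_ternary_frame` and the event layout are hypothetical.

```python
import numpy as np

def events_to_ternary_frame(events, height=128, width=128):
    """Collapse the events of one time window into a frame with values in {-1, 0, +1}.

    Assumption: each event is (x, y, polarity), polarity being +1 or -1.
    """
    acc = np.zeros((height, width), dtype=np.int32)
    for x, y, polarity in events:
        # Accumulate the net polarity seen at this pixel during the window.
        acc[y, x] += polarity
    # Ternarize by keeping only the sign of the accumulated polarity.
    return np.sign(acc).astype(np.int8)

# Hypothetical events: two positive at (3, 5), one negative at (10, 7),
# and a cancelling pair at (20, 20).
events = [(3, 5, +1), (3, 5, +1), (10, 7, -1), (20, 20, +1), (20, 20, -1)]
frame = events_to_ternary_frame(events)
```

With this rule, a pixel that receives equally many positive and negative events in a window ends up at 0, the same value as a pixel that received none. Other rules, such as keeping the polarity of the most recent event, would be equally consistent with the abstract.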
Pages: 717-731
Number of pages: 15