7 μJ/inference end-to-end gesture recognition from dynamic vision sensor data using ternarized hybrid convolutional neural networks

Cited by: 2
Authors
Rutishauser, Georg [1]
Scherer, Moritz [1]
Fischer, Tim [1]
Benini, Luca [1, 2]
Affiliations
[1] Swiss Fed Inst Technol, Gloriastr 35, CH-8092 Zurich, Switzerland
[2] Univ Bologna, Via Zamboni 33, I-40126 Bologna, Italy
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2023 / Vol. 149
Funding
EU Horizon 2020
Keywords
Dynamic vision sensors; Gesture recognition; Ternary neural networks; Edge computing; Low-power systems; EFFICIENT; ACCELERATOR
DOI
10.1016/j.future.2023.07.017
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Dynamic vision sensor (DVS) cameras enable energy-activity proportional visual sensing by only propagating events produced by changes in the observed scene. Furthermore, by generating these events asynchronously, they offer μs-scale latency while eliminating the redundant data transmission inherent to classical, frame-based cameras. However, the potential of DVS to improve the energy efficiency of IoT sensor nodes can only be fully realized with efficient and flexible systems that tightly integrate sensing, processing, and actuation capabilities. In this paper, we propose a complete end-to-end pipeline for DVS event data classification implemented on the Kraken parallel ultra-low power (PULP) system-on-chip and apply it to gesture recognition. A dedicated on-chip peripheral interface for DVS cameras aggregates the received events into ternary event frames. We process these video frames with a fully ternarized two-stage temporal convolutional network (TCN). The neural network can be executed either on Kraken's PULP cluster of general-purpose RISC-V cores or on CUTIE, the on-chip ternary neural network accelerator. We perform extensive ablations on network structure, training, and data generation parameters. We achieve a validation accuracy of 97.7% on the DVS128 11-class gesture dataset, a new record for embedded implementations. With in-silicon power and energy measurements, we demonstrate a classification energy of 7 μJ at a latency of 0.9 ms when running the TCN on CUTIE, a reduction of inference energy by 67× when compared to the state of the art in embedded gesture recognition. The processing system consumes as little as 4.7 mW in continuous inference, enabling always-on gesture recognition and closing the gap between the efficiency potential of DVS cameras and application scenarios. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
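The abstract states that received DVS events are aggregated into ternary event frames but does not give the rule. The following is a minimal software sketch of one plausible way to do this, not the Kraken peripheral's actual logic. The event layout `(x, y, polarity, timestamp_us)`, the function name `events_to_ternary_frame`, the fixed time window, and the sign-of-net-polarity reduction are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical event layout: (x, y, polarity, timestamp_us), polarity in {+1, -1}.
Event = Tuple[int, int, int, int]


def events_to_ternary_frame(
    events: Iterable[Event],
    width: int,
    height: int,
    t_start_us: int,
    window_us: int,
) -> List[List[int]]:
    """Accumulate events in [t_start_us, t_start_us + window_us) into one frame
    whose pixels take values in {-1, 0, +1}."""
    acc = [[0] * width for _ in range(height)]
    t_end_us = t_start_us + window_us
    for x, y, polarity, t_us in events:
        in_window = t_start_us <= t_us < t_end_us
        in_bounds = 0 <= x < width and 0 <= y < height
        if in_window and in_bounds:
            acc[y][x] += polarity
    # Assumed reduction rule: keep only the sign of the net polarity per pixel.
    return [[(v > 0) - (v < 0) for v in row] for row in acc]


# Toy 4x2 sensor: the last event falls outside the 1 ms window and is ignored.
events = [
    (0, 0, 1, 10),
    (0, 0, 1, 20),
    (1, 0, -1, 30),
    (2, 1, 1, 40),
    (2, 1, -1, 50),
    (3, 1, 1, 5000),
]
frame = events_to_ternary_frame(events, width=4, height=2, t_start_us=0, window_us=1000)
print(frame)
```

In this sketch, a pixel with no events or with cancelling events stays at 0, so a frame from a static scene is all zeros. That mirrors the activity-proportional behavior the abstract describes, although the real hardware may use a different aggregation rule.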
Pages: 717-731
Page count: 15