A 71.2-μW Speech Recognition Accelerator With Recurrent Spiking Neural Network

Cited by: 2
Authors
Yang, Chih-Chyau [1,2]
Chang, Tian-Sheuan [1 ]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Inst Elect, Hsinchu 30010, Taiwan
[2] Taiwan Semicond Res Inst, Hsinchu 30078, Taiwan
Keywords
Deep-learning accelerator; recurrent spiking neural networks; zero-skipping hardware; ultra-low power; model compression
DOI
10.1109/TCSI.2024.3387993
CLC Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline Code
0808; 0809
Abstract
This paper introduces a 71.2-μW speech recognition accelerator designed for real-time applications on edge devices, emphasizing an ultra-low-power design. Through algorithm and hardware co-optimization, we propose a compact recurrent spiking neural network with two recurrent layers, one fully connected layer, and a low time step (1 or 2). The 2.79-MB model undergoes pruning and 4-bit fixed-point quantization, shrinking it by 96.42% to 0.1 MB. On the hardware front, we exploit mixed-level pruning, zero-skipping, and merged-spike techniques, reducing complexity by 90.49% to 13.86 MMAC/s. Parallel time-step execution resolves inter-time-step data dependencies and enables weight-buffer power savings through weight sharing. Capitalizing on the sparse spike activity, an input-broadcasting scheme eliminates zero computations, further saving power. Implemented in the TSMC 28-nm process, the design operates in real time at 100 kHz, consuming 71.2 μW, surpassing state-of-the-art designs. At 500 MHz, it achieves 28.41 TOPS/W in energy efficiency and 1903.11 GOPS/mm² in area efficiency.
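As a minimal illustrative sketch (not the authors' hardware design), the Python snippet below shows the spike-driven, zero-skipping accumulation that the abstract's input-broadcasting scheme exploits: because spike inputs are binary, each multiply-accumulate collapses to a column-wise add, and inputs that did not spike are skipped entirely. The function name, array shapes, and int8 storage for the 4-bit weights are assumptions for illustration only.

import numpy as np

def spike_driven_matvec(spikes, W, bias=None):
    # spikes: (n_in,) binary 0/1 vector of input spikes for one time step
    # W: (n_out, n_in) weight matrix, e.g. 4-bit values stored as int8 in [-8, 7]
    acc = np.zeros(W.shape[0], dtype=np.int32)  # output accumulators
    for i in np.flatnonzero(spikes):  # visit spiking inputs only (zero-skipping)
        acc += W[:, i]                # broadcast weight column to all accumulators; no multiply needed
    if bias is not None:
        acc += bias
    return acc

# Hypothetical usage: with sparse spike activity (~10% here), most weight
# columns are never touched, which is where the power saving comes from.
rng = np.random.default_rng(0)
W = rng.integers(-8, 8, size=(64, 128), dtype=np.int8)   # 4-bit weights in int8
spikes = (rng.random(128) < 0.1).astype(np.int8)         # sparse binary spikes
membrane = spike_driven_matvec(spikes, W)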
Pages: 3203-3213
Page count: 11