FPGA-based Low-power Speech Recognition with Recurrent Neural Networks

被引:38
作者
Lee, Minjae [1 ]
Hwang, Kyuyeon [1 ]
Park, Jinhwan [1 ]
Choi, Sungwook [1 ]
Shin, Sungho [1 ]
Sung, Wonyong [1 ]
机构
[1] Seoul Natl Univ, Dept Elect & Comp Engn, 1 Gwanak Ro, Seoul 08826, South Korea
来源
2016 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS) | 2016年
关键词
D O I
10.1109/SiPS.2016.48
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
In this paper, a neural network based real-time speech recognition (SR) system is developed using an FPGA for very low-power operation. The implemented system employs two recurrent neural networks (RNNs); one is a speech-to-character RNN for acoustic modeling (AM) and the other is for character-level language modeling (LM). The system also employs a statistical word-level LM to improve the recognition accuracy. The results of the AM, the character-level LM, and the word-level LM are combined using a fairly simple N-best search algorithm instead of the hidden Markov model (HMM) based network. The RNNs are implemented using massively parallel processing elements (PEs) for low latency and high throughput. The weights are quantized to 6 bits to store all of them in the on-chip memory of an FPGA. The proposed algorithm is implemented on a Xilinx XC7Z045, and the system can operate much faster than real-time.
引用
收藏
页码:230 / 235
页数:6
相关论文
共 29 条
[1]  
Amodei D, 2016, PR MACH LEARN RES, V48
[2]  
[Anonymous], 2015, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
[3]  
[Anonymous], 2011, P 28 INT C MACH LEAR
[4]  
Anwar S, 2015, INT CONF ACOUST SPEE, P1131, DOI 10.1109/ICASSP.2015.7178146
[5]   AN FPGA IMPLEMENTATION OF SPEECH RECOGNITION WITH WEIGHTED FINITE STATE TRANSDUCERS [J].
Choi, Jungwook ;
You, Kisun ;
Sung, Wonyong .
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, :1602-1605
[6]  
Federico M, 2008, INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, P1618
[7]   VITERBI ALGORITHM [J].
FORNEY, GD .
PROCEEDINGS OF THE IEEE, 1973, 61 (03) :268-278
[8]  
Graves A., 2006, PROC INT C MACH LEAR, P369, DOI DOI 10.1145/1143844.1143891
[9]  
Graves A, 2014, PR MACH LEARN RES, V32, P1764
[10]   EIE: Efficient Inference Engine on Compressed Deep Neural Network [J].
Han, Song ;
Liu, Xingyu ;
Mao, Huizi ;
Pu, Jing ;
Pedram, Ardavan ;
Horowitz, Mark A. ;
Dally, William J. .
2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, :243-254