Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition

Cited by: 79

Authors:

Wu, Jibin [1]

Yilmaz, Emre [1]

Zhang, Malu [1]

Li, Haizhou [1,2]

Tan, Kay Chen [3]

Affiliations:

[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore

[2] Univ Bremen, Fac Comp Sci & Math, Bremen, Germany

[3] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China

Source:

FRONTIERS IN NEUROSCIENCE | 2020 / Volume 14

Keywords:

deep spiking neural networks; automatic speech recognition; tandem learning; neuromorphic computing; acoustic modeling; HIDDEN MARKOV-MODELS; COMMUNICATION; ARCHITECTURE;

DOI:

10.3389/fnins.2020.00199

Chinese Library Classification number:

Q189 [Neuroscience];

Discipline classification code:

071006;

Abstract:

Artificial neural networks (ANNs) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. Brain-inspired spiking neural networks (SNNs) closely mimic biological neural networks and can operate on low-power neuromorphic hardware with spike-based computation. Motivated by their unprecedented energy efficiency and rapid information processing capability, we explore the use of SNNs for speech recognition. In this work, we use SNNs for acoustic modeling and evaluate their performance in several large vocabulary recognition scenarios. The experimental results demonstrate ASR accuracies competitive with those of their ANN counterparts, while requiring only 10 algorithmic time steps and as few as 0.68 times the total synaptic operations to classify each audio frame. Integrating the algorithmic power of deep SNNs with energy-efficient neuromorphic hardware therefore offers an attractive solution for ASR applications running locally on mobile and embedded devices.


Pages: 14

50 records in total

[21] A novel learning approach in deep spiking neural networks with multi-objective optimization algorithms for automatic digit speech recognition
Hamian, Melika
Faez, Karim
Nazari, Soheila
Sabeti, Malihe
The Journal of Supercomputing, 2023, 79 : 20263 - 20288
[22] A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition
Li, Xiangang
Yang, Yuning
Pang, Zaihu
Wu, Xihong
NEUROCOMPUTING, 2015, 170 : 251 - 256
[23] CONSTRUCTING LONG SHORT-TERM MEMORY BASED DEEP RECURRENT NEURAL NETWORKS FOR LARGE VOCABULARY SPEECH RECOGNITION
Li, Xiangang
Wu, Xihong
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015 : 4520 - 4524
[24] ADAPTATION OF CONTEXT-DEPENDENT DEEP NEURAL NETWORKS FOR AUTOMATIC SPEECH RECOGNITION
Yao, Kaisheng
Yu, Dong
Seide, Frank
Su, Hang
Deng, Li
Gong, Yifan
2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012 : 366 - 369
[25] DEEP NEURAL NETWORKS BASED AUTOMATIC SPEECH RECOGNITION FOR FOUR ETHIOPIAN LANGUAGES
Abate, Solomon Teferra
Tachbelie, Martha Yifiru
Schultz, Tanja
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020 : 8274 - 8278
[26] Automatic Speech Recognition Based on Neural Networks
Schlueter, Ralf
Doetsch, Patrick
Golik, Pavel
Kitza, Markus
Menne, Tobias
Irie, Kazuki
Tueske, Zoltan
Zeyer, Albert
SPEECH AND COMPUTER, 2016, 9811 : 3 - 17
[27] Speech Command Recognition Based on Convolutional Spiking Neural Networks
Sadovsky, Erik
Jakubec, Maros
Jarina, Roman
2023 33RD INTERNATIONAL CONFERENCE RADIOELEKTRONIKA, RADIOELEKTRONIKA, 2023
[28] Neuromorphic Speech Recognition With Photonic Convolutional Spiking Neural Networks
Xiang, Shuiying
Zhang, Tianrui
Han, Yanan
Guo, Xingxing
Zhang, Yahui
Shi, Yuechun
Hao, Yue
IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS, 2023, 29 (06)
[29] NEURON SPARSENESS VERSUS CONNECTION SPARSENESS IN DEEP NEURAL NETWORK FOR LARGE VOCABULARY SPEECH RECOGNITION
Kang, Jian
Lu, Cheng
Cai, Meng
Zhang, Wei-Qiang
Liu, Jia
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015 : 4954 - 4958
[30] DEEP-FSMN FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
Zhang, Shiliang
Lei, Ming
Yan, Zhijie
Dai, Lirong
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 5869 - 5873
