Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition

Cited by: 79

Authors:

Wu, Jibin [1]

Yilmaz, Emre [1]

Zhang, Malu [1]

Li, Haizhou [1,2]

Tan, Kay Chen [3]

Affiliations:

[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore

[2] Univ Bremen, Fac Comp Sci & Math, Bremen, Germany

[3] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China

Source:

FRONTIERS IN NEUROSCIENCE | 2020 / Volume 14

Keywords:

deep spiking neural networks; automatic speech recognition; tandem learning; neuromorphic computing; acoustic modeling; HIDDEN MARKOV-MODELS; COMMUNICATION; ARCHITECTURE;

DOI:

10.3389/fnins.2020.00199

Chinese Library Classification (CLC) number:

Q189 [Neuroscience];

Discipline classification code:

071006;

Abstract:

Artificial neural networks (ANNs) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. The brain-inspired spiking neural networks (SNNs) closely mimic biological neural networks and can operate on low-power neuromorphic hardware with spike-based computation. Motivated by their unprecedented energy efficiency and rapid information processing capability, we explore the use of SNNs for speech recognition. In this work, we use SNNs for acoustic modeling and evaluate their performance on several large vocabulary recognition scenarios. The experimental results demonstrate ASR accuracies competitive with those of their ANN counterparts, while requiring only 10 algorithmic time steps and as low as 0.68 times the total synaptic operations to classify each audio frame. Integrating the algorithmic power of deep SNNs with energy-efficient neuromorphic hardware therefore offers an attractive solution for ASR applications running locally on mobile and embedded devices.


Pages: 14

