Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition

Cited by: 79
Authors
Wu, Jibin [1]
Yilmaz, Emre [1]
Zhang, Malu [1]
Li, Haizhou [1, 2]
Tan, Kay Chen [3]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[2] Univ Bremen, Fac Comp Sci & Math, Bremen, Germany
[3] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China
Keywords
deep spiking neural networks; automatic speech recognition; tandem learning; neuromorphic computing; acoustic modeling; hidden Markov models; communication; architecture
DOI
10.3389/fnins.2020.00199
Chinese Library Classification number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
Artificial neural networks (ANNs) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. Brain-inspired spiking neural networks (SNNs) closely mimic biological neural networks and can operate on low-power neuromorphic hardware with spike-based computation. Motivated by their unprecedented energy efficiency and rapid information processing capability, we explore the use of SNNs for speech recognition. In this work, we use SNNs for acoustic modeling and evaluate their performance in several large vocabulary recognition scenarios. The experimental results demonstrate ASR accuracies competitive with those of their ANN counterparts, while requiring only 10 algorithmic time steps and as few as 0.68 times the total synaptic operations to classify each audio frame. Integrating the algorithmic power of deep SNNs with energy-efficient neuromorphic hardware therefore offers an attractive solution for ASR applications running locally on mobile and embedded devices.
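To make the "0.68 times total synaptic operations" figure concrete, the following is a minimal sketch of one common way such a ratio can be computed for fully connected networks. It is not taken from the paper: the counting convention (one multiply-accumulate per synapse for the ANN, one accumulate per outgoing synapse per spike for the SNN), the layer sizes, and the spike counts are all assumptions chosen for illustration. Only the 10 time steps and the 0.68 ratio come from the abstract.

```python
def ann_synaptic_ops(layer_sizes):
    """Multiply-accumulates for one forward pass of a dense ANN (assumed convention)."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def snn_synaptic_ops(layer_sizes, spike_counts):
    """Accumulate operations for a dense SNN while classifying one frame.

    spike_counts[i] is the total number of spikes emitted by layer i, summed
    over all of its neurons and all time steps. Each spike is assumed to
    trigger one accumulate per outgoing synapse, i.e. layer_sizes[i + 1] ops.
    """
    fan_outs = layer_sizes[1:]
    return sum(spikes * fan_out for spikes, fan_out in zip(spike_counts, fan_outs))


def synop_ratio(layer_sizes, spike_counts):
    """Ratio of SNN to ANN synaptic operations for the same topology."""
    return snn_synaptic_ops(layer_sizes, spike_counts) / ann_synaptic_ops(layer_sizes)


# Hypothetical 3-layer network; sizes are illustrative, not the paper's.
layer_sizes = [100, 200, 50]
time_steps = 10  # number of algorithmic time steps quoted in the abstract

# Hypothetical activity: 0.068 spikes per neuron per time step on average,
# i.e. 0.68 spikes per neuron over the 10-step window.
spike_counts = [68, 136]

ratio = synop_ratio(layer_sizes, spike_counts)
print(f"SNN/ANN synaptic operations: {ratio:.2f}")
```

Under this convention the ratio equals the average number of spikes per presynaptic neuron over the time window, so a ratio below 1 would mean that, on average, each neuron fires less than once per audio frame.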
Pages: 14