Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition

Cited by: 79
Authors
Wu, Jibin [1]
Yilmaz, Emre [1]
Zhang, Malu [1]
Li, Haizhou [1, 2]
Tan, Kay Chen [3 ]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[2] Univ Bremen, Fac Comp Sci & Math, Bremen, Germany
[3] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China
Keywords
deep spiking neural networks; automatic speech recognition; tandem learning; neuromorphic computing; acoustic modeling; HIDDEN MARKOV-MODELS; COMMUNICATION; ARCHITECTURE
DOI
10.3389/fnins.2020.00199
Chinese Library Classification number
Q189 [Neuroscience]
Subject classification code
071006
Abstract
Artificial neural networks (ANNs) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. Brain-inspired spiking neural networks (SNNs) closely mimic biological neural networks and can operate on low-power neuromorphic hardware with spike-based computation. Motivated by their unprecedented energy efficiency and rapid information processing capability, we explore the use of SNNs for speech recognition. In this work, we use SNNs for acoustic modeling and evaluate their performance on several large vocabulary recognition scenarios. The experimental results demonstrate ASR accuracies competitive with those of their ANN counterparts, while requiring only 10 algorithmic time steps and as few as 0.68 times the total synaptic operations to classify each audio frame. Integrating the algorithmic power of deep SNNs with energy-efficient neuromorphic hardware therefore offers an attractive solution for ASR applications running locally on mobile and embedded devices.
Pages: 14