Gated Time Delay Neural Network for Speech Recognition

Cited by: 2
Authors
Chen, Kaibin [1]
Zhang, Weibin [1]
Chen, Dongpeng [2]
Huang, Xiaorong [1]
Liu, Boji [1]
Xu, Xiangmin [1]
Affiliations
[1] South China Univ Technol, Guangzhou, Guangdong, Peoples R China
[2] VoiceAI Technol, Shenzhen, Peoples R China
DOI
10.1088/1742-6596/1229/1/012077
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
In deep neural networks, the gate mechanism is a very effective tool for controlling the information flow. For example, the gates of Long Short-Term Memory (LSTM) help alleviate the vanishing gradient problem. In addition, these gates preserve useful information. We believe that a system will benefit if it learns to explicitly focus on the relevant dimensions of the input. In this paper, we propose Gated Time Delay Neural Networks (Gated TDNN) for speech recognition. Time-delay layers are used to model the long temporal context correlation of the speech signal, while the gate mechanism enables the model to discover the relevant dimensions of the input. Our experimental results on the Switchboard and Librispeech data sets demonstrate the effectiveness of the proposed method.
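The abstract gives no equations, so the following is only a minimal sketch of how a gated time-delay layer might look, not the authors' formulation. It assumes the gate is a sigmoid computed from the spliced context window and applied element-wise to that same window before an affine transform and ReLU. The names `splice`, `affine`, and `gated_tdnn_layer`, the edge clamping, and the weight shapes are all illustrative assumptions.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def splice(frames, t, offsets):
    # Time-delay step: concatenate the frames at t + offset for each offset.
    # Indices outside the sequence are clamped to the first/last frame
    # (an assumption; other padding schemes are possible).
    last = len(frames) - 1
    window = []
    for off in offsets:
        idx = min(max(t + off, 0), last)
        window.extend(frames[idx])
    return window


def affine(W, b, x):
    # Computes W @ x + b for a list-of-rows matrix W.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def gated_tdnn_layer(frames, offsets, W_g, b_g, W_h, b_h):
    # Assumed shapes: W_g is (D, D) and W_h is (H, D), where
    # D = len(offsets) * frame_dim and H is the output size.
    outputs = []
    for t in range(len(frames)):
        x = splice(frames, t, offsets)
        # One gate value in (0, 1) per dimension of the spliced input.
        gate = [sigmoid(v) for v in affine(W_g, b_g, x)]
        # Scale each input dimension by its gate (the assumed gating form).
        gated = [g * xi for g, xi in zip(gate, x)]
        # Affine transform followed by ReLU.
        outputs.append([max(0.0, v) for v in affine(W_h, b_h, gated)])
    return outputs
```

With offsets such as `(-1, 0, 1)`, each output frame sees one frame of left and right context; stacking several such layers widens the temporal context, which is the usual motivation for time-delay layers.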
Pages: 7
Related papers
50 in total
  • [1] Time Delay Recurrent Neural Network for Speech Recognition
    Liu, Boji
    Zhang, Weibin
    Xu, Xiangming
    Chen, Dongpeng
    2019 3RD INTERNATIONAL CONFERENCE ON MACHINE VISION AND INFORMATION TECHNOLOGY (CMVIT 2019), 2019, 1229
  • [2] Convolutional Time Delay Neural Network for Khmer Automatic Speech Recognition
    Srun, Nalin
    Leang, Sotheara
    Thu, Ye Kyaw
    Sam, Sethserey
    2022 17TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2022) / 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (AIOT 2022), 2022
  • [3] Gated Module Neural Network for Multilingual Speech Recognition
    Liao, Yuan-Fu
    Pleva, Matus
    Hladek, Daniel
    Stas, Jan
    Viszlay, Peter
    Lojka, Martin
    Juhar, Jozef
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 131 - 135
  • [4] Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network
    Wu, Fei
    Garcia, Leibny Paola
    Povey, Daniel
    Khudanpur, Sanjeev
    INTERSPEECH 2019, 2019, : 1 - 5
  • [5] Attention gated tensor neural network architectures for speech emotion recognition
    Pandey, Sandeep Kumar
    Shekhawat, Hanumant Singh
    Prasanna, S. R. M.
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 71
  • [6] Dysarthric Speech Recognition using Time-delay Neural Network based Denoising Autoencoder
    Bhat, Chitralekha
    Das, Biswajit
    Vachhani, Bhavik
    Kopparapu, Sunil Kumar
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 451 - 455
  • [7] Modular Construction of Time-Delay Neural Networks for Speech Recognition
    Waibel, Alex
    NEURAL COMPUTATION, 1989, 1 (01) : 39 - 46
  • [8] Recurrent neural network with backpropagation through time for speech recognition
    Ahmad, AM
    Ismail, S
    Samaon, DF
    IEEE INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES 2004 (ISCIT 2004), PROCEEDINGS, VOLS 1 AND 2: SMART INFO-MEDIA SYSTEMS, 2004, : 98 - 102
  • [9] A Time Delay Neural Network for Online Arabic Handwriting Recognition
    Zouari, Ramzi
    Boubaker, Houcine
    Kherallah, Monji
    INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA 2016), 2017, 557 : 1005 - 1014
  • [10] Hindi speech recognition using time delay neural network acoustic modeling with i-vector adaptation
    Ankit Kumar
    Rajesh Kumar Aggarwal
    International Journal of Speech Technology, 2022, 25 : 67 - 78