Performance Optimization of Speech Recognition System with Deep Neural Network Model

Authors
Wei Guan [1]
Affiliations
[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang
Keywords
acoustic model; deep neural network; discriminative training; performance optimization; speech recognition
DOI
10.3103/S1060992X18040094
Abstract
With the development of the internet, human-machine interaction has become increasingly important, and accurate speech recognition is a key means of achieving it. In this study, deep neural network models were used to improve speech recognition performance. Four architectures were examined: the feedforward fully connected deep neural network, the time-delay neural network, the convolutional neural network, and the feedforward sequence memory neural network. Their recognition performance was evaluated by comparing their acoustic models. The recognition performance of the models after adding human voice features of different dimensions was also tested. The results showed that deep neural network models effectively improved the performance of the speech recognition system. The feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network, and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after vocal characteristics were added, and different models had different vocal characteristic dimensions. © 2018, Allerton Press, Inc.
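The abstract compares Fbank and MFCC input features. As a rough illustration of how the two relate, the sketch below computes log Mel filterbank energies (Fbank) and then derives MFCCs from them with a discrete cosine transform. It is not the paper's feature pipeline: the 16 kHz sample rate, 25 ms frames, 10 ms hop, 40 filters, 13 cepstral coefficients, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):       # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def frame_signal(signal, frame_len, hop):
    """Slice the signal into overlapping Hamming-windowed frames."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def log_fbank(signal, sample_rate=16000, n_filters=40, n_fft=512):
    """Fbank features: log energies in each mel filter, one row per frame."""
    frames = frame_signal(signal, frame_len=400, hop=160)  # 25 ms / 10 ms at 16 kHz (assumed)
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(np.maximum(energies, 1e-10))  # floor avoids log(0)

def mfcc_from_fbank(log_energies, n_ceps=13):
    """MFCCs: unnormalized DCT-II of the log filterbank energies, truncated to n_ceps."""
    n = log_energies.shape[1]
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return log_energies @ basis.T

# One second of synthetic noise stands in for real speech.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
fbank_feats = log_fbank(signal)
mfcc_feats = mfcc_from_fbank(fbank_feats)
print(fbank_feats.shape, mfcc_feats.shape)
```

In this sketch the MFCCs are a truncated linear transform of the Fbank features, so they hold a decorrelated, lower-dimensional summary of the same spectral information. Whether that accounts for the difference reported in the abstract is not something this sketch can show.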
Pages: 272-282
Page count: 10