Performance Optimization of Speech Recognition System with Deep Neural Network Model

Cited by: 0
Author
Wei Guan [1]
Affiliation
[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang
Keywords
acoustic model; deep neural network; discriminative training; performance optimization; speech recognition
DOI
10.3103/S1060992X18040094
Abstract
With the development of the internet, man-machine interaction has become increasingly important, and accurate speech recognition has become an important means of achieving it. In this study, deep neural network models were used to improve speech recognition performance. Four architectures were studied: the feedforward fully connected deep neural network, the time-delay neural network, the convolutional neural network, and the feedforward sequence memory neural network. Their speech recognition performance was compared through their acoustic models. The recognition performance of the models after adding human voice features of different dimensions was also tested. The results showed that deep neural network models effectively improved the performance of the speech recognition system. The feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network, and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after vocal characteristics were added, and the vocal characteristic dimension differed between models. © 2018, Allerton Press, Inc.
Pages: 272-282
Page count: 10
Related papers
50 in total
  • [31] A Gender-Aware Deep Neural Network Structure for Speech Recognition
    Toktam Zoughi
    Mohammad Mehdi Homayounpour
    Iranian Journal of Science and Technology, Transactions of Electrical Engineering, 2019, 43 : 635 - 644
  • [32] A Gender-Aware Deep Neural Network Structure for Speech Recognition
    Zoughi, Toktam
    Homayounpour, Mohammad Mehdi
    IRANIAN JOURNAL OF SCIENCE AND TECHNOLOGY-TRANSACTIONS OF ELECTRICAL ENGINEERING, 2019, 43 (03) : 635 - 644
  • [33] Artificial Bandwidth Extension Using H∞ Optimization, Deep Neural Network, and Speech Production Model
    Gupta, Deepika
    Shekhawat, Hanumant Singh
    2022 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS, SPCOM, 2022,
  • [34] Structural Optimization of Deep Belief Network Theorem for Classification in Speech Recognition
    Prasetio, Murman Dwi
    Hayashida, Tomohiro
    Nishizaki, Ichiro
    Sekizaki, Shinya
    2017 IEEE 10TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL INTELLIGENCE AND APPLICATIONS (IWCIA), 2017, : 121 - 128
  • [35] Noisy training for deep neural networks in speech recognition
    Shi Yin
    Chao Liu
    Zhiyong Zhang
    Yiye Lin
    Dong Wang
    Javier Tejedor
    Thomas Fang Zheng
    Yinguo Li
    EURASIP Journal on Audio, Speech, and Music Processing, 2015
  • [36] Noisy training for deep neural networks in speech recognition
    Yin, Shi
    Liu, Chao
    Zhang, Zhiyong
    Lin, Yiye
    Wang, Dong
    Tejedor, Javier
    Zheng, Thomas Fang
    Li, Yinguo
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2015, : 1 - 14
  • [37] Deep neural network training for whispered speech recognition using small databases and generative model sampling
    Ghaffarzadegan S.
    Bořil H.
    Hansen J.H.L.
    Hansen, John H. L. (john.hansen@utdallas.edu), 1600, Springer Science and Business Media, LLC (20): 1063 - 1075
  • [38] Speech recognition using a stereo vision neural network model
    Tetsuro Kitazoe
    Sung-Ill Kim
    Tomoyuki Ichiki
    Artificial Life and Robotics, 2000, 4 (1) : 37 - 41
  • [39] NEURON SPARSENESS VERSUS CONNECTION SPARSENESS IN DEEP NEURAL NETWORK FOR LARGE VOCABULARY SPEECH RECOGNITION
    Kang, Jian
    Lu, Cheng
    Cai, Meng
    Zhang, Wei-Qiang
    Liu, Jia
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4954 - 4958
  • [40] Isolated Word Speech Recognition System Using Deep Neural Networks
    Dhanashri, Dhavale
    Dhonde, S. B.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA ENGINEERING AND COMMUNICATION TECHNOLOGY, ICDECT 2016, VOL 1, 2017, 468 : 9 - 17