Performance Optimization of Speech Recognition System with Deep Neural Network Model

Authors
Wei Guan [1]
Affiliation
[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang
Keywords
acoustic model; deep neural network; discriminative training; performance optimization; speech recognition
DOI
10.3103/S1060992X18040094
Abstract
With the development of the internet, man-machine interaction has become increasingly important, and accurate speech recognition is a key means of achieving it. In this study, deep neural network models were used to improve speech recognition performance. Four architectures were examined: the feedforward fully connected deep neural network, the time-delay neural network, the convolutional neural network, and the feedforward sequence memory neural network. Their speech recognition performance was compared through their acoustic models. The recognition performance of the models after adding human voice features of different dimensions was also tested. The results showed that deep neural network models effectively improved the performance of the speech recognition system. The feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network, and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after vocal characteristics were added, and the vocal characteristic dimensions differed among models. © 2018, Allerton Press, Inc.
Pages: 272-282
Number of pages: 10