Performance Optimization of Speech Recognition System with Deep Neural Network Model

Cited by: 0

Authors:

Wei Guan [1]

Affiliations:

[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang

Source:

Optical Memory and Neural Networks | 2018 / Vol. 27 / No. 4

Keywords:

acoustic model; deep neural network; discriminative training; performance optimization; speech recognition

DOI:

10.3103/S1060992X18040094

Abstract:

With the development of the internet, man-machine interaction has become increasingly important, and precise speech recognition is an important means of achieving it. In this study, deep neural network models were used to enhance speech recognition performance. Four architectures were examined: the feedforward fully connected deep neural network, the time-delay neural network, the convolutional neural network, and the feedforward sequence memory neural network. Their speech recognition performance was evaluated by comparing their acoustic models. The recognition performance of the models after adding human voice features of different dimensions was also tested. The results showed that deep neural network models effectively improved the performance of the speech recognition system. The feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network, and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after vocal features were added, and the vocal feature dimension differed across models. © 2018, Allerton Press, Inc.

Pages: 272-282

Page count: 10
