Performance Optimization of Speech Recognition System with Deep Neural Network Model

Cited by: 0

Authors:

Wei Guan [1]

Affiliations:

[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang

Source:

Optical Memory and Neural Networks | 2018 / Vol. 27 / No. 4

Keywords:

acoustic model; deep neural network; discriminative training; performance optimization; speech recognition

DOI:

10.3103/S1060992X18040094

Abstract:

With the development of the internet, man-machine interaction has become increasingly important, and precise speech recognition has become an important means of achieving it. In this study, deep neural network models were used to enhance speech recognition performance. Four architectures were examined: the feedforward fully connected deep neural network, the time-delay neural network, the convolutional neural network, and the feedforward sequence memory neural network. Their speech recognition performance was evaluated by comparing their acoustic models. Moreover, the recognition performance of the models after adding human voice features of different dimensions was tested. The results showed that the performance of the speech recognition system could be improved effectively by using deep neural network models; the feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network, and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after the addition of vocal characteristics, and different models had different vocal characteristic dimensions. © 2018, Allerton Press, Inc.

Pages: 272-282

Page count: 10

50 items in total

[1] A Multi-Region Deep Neural Network Model in Speech Recognition
Cui, Jia
Saon, George
Ramabhadran, Bhuvana
Kingsbury, Brian
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 3244-3248
[2] Speech Recognition Model for Assamese Language Using Deep Neural Network
Singh, Moirangthem Tiken
Barman, Partha Pratim
Gogoi, Rupjyoti
2018 INTERNATIONAL CONFERENCE ON RECENT INNOVATIONS IN ELECTRICAL, ELECTRONICS & COMMUNICATION ENGINEERING (ICRIEECE 2018), 2018: 2722-2727
[3] Primi Speech Recognition Based on Deep Neural Network
Hu, Wenjun
Fu, Meijun
Pan, Wenlin
2016 IEEE 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS (IS), 2016: 667-671
[4] Indonesian speech recognition based on Deep Neural Network
Yang, Ruolin
Yang, Jian
Lu, Yu
2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021: 36-41
[5] Donggan speech recognition based on deep neural network
Xu, Haiyan
Yang, Hongwu
You, Yuren
PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019: 354-358
[6] Audio-Visual (Multimodal) Speech Recognition System Using Deep Neural Network
Paulin, Hebsibah
Milton, R. S.
JanakiRaman, S.
Chandraprabha, K.
JOURNAL OF TESTING AND EVALUATION, 2019, 47 (06): 3963-3974
[7] Deep Belief Network Optimization in Speech Recognition
Prasetio, Murman Dwi
Hayashida, Tomohiro
Nishizaki, Ichiro
Sekizaki, Shinya
2017 INTERNATIONAL CONFERENCE ON SUSTAINABLE INFORMATION ENGINEERING AND TECHNOLOGY (SIET), 2017: 138-143
[8] BILINGUAL SPEECH RECOGNITION SYSTEM FOR ISOLATED WORDS USING DEEP NEURAL NETWORK
Bharathi, B.
Kavitha, S.
Sugapriya, S.
2018 2ND INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION, AND SIGNAL PROCESSING (ICCCSP): SPECIAL FOCUS ON TECHNOLOGY AND INNOVATION FOR SMART ENVIRONMENT, 2018: 78-81
[9] A NETWORK OF DEEP NEURAL NETWORKS FOR DISTANT SPEECH RECOGNITION
Ravanelli, Mirco
Brakel, Philemon
Omologo, Maurizio
Bengio, Yoshua
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 4880-4884
[10] DEEP RECURRENT REGULARIZATION NEURAL NETWORK FOR SPEECH RECOGNITION
Chien, Jen-Tzung
Lu, Tsai-Wei
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015: 4560-4564
