A Study on Speech Recognition by a Neural Network Based on English Speech Feature Parameters

Cited by: 1

Authors:

Mao, Congmin [1]

Liu, Sujing [1]

Affiliation:

[1] Hebei GEO Univ, Huaxin Coll, 69 Wufan Rd,Airport Ind Pk, Shijiazhuang 050700, Hebei, Peoples R China

Source:

JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS | 2024, Vol. 28, No. 3

Keywords:

English; speech feature parameters; back-propagation neural network; speech recognition; mel-frequency cepstral coefficient

DOI:

10.20965/jaciii.2024.p0679

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This study approaches English speech recognition from the perspective of speech feature parameters. Two feature parameters, the mel-frequency cepstral coefficient (MFCC) and the filter bank (Fbank), were selected to identify English speech. Three recognition algorithms were used: the classical back-propagation neural network (BPNN), the recurrent neural network (RNN), and the long short-term memory (LSTM) network, which is an improvement on the RNN. The experiments compared the three recognition algorithms with one another and also compared the effects of the two feature parameters on their performance. The LSTM model had the best recognition performance among the three neural networks under different experimental environments. The neural network models using the MFCC feature parameter outperformed those using the Fbank feature parameter. The LSTM model had the highest accuracy and the fastest recognition speed, the RNN model ranked second, and the BPNN model ranked last. The results confirm that the LSTM model achieves higher speech recognition accuracy than the other neural networks.
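The relationship between the two feature parameters named in the abstract can be sketched in a few lines: Fbank features are log mel filter-bank energies, and MFCCs are obtained by applying a discrete cosine transform to them. The sketch below is illustrative only and is not the authors' implementation. The sample rate (16 kHz), frame length (400 samples), hop (160 samples), FFT size (512), filter count (26), and number of cepstral coefficients (13) are assumed values, all function names are hypothetical, and pre-emphasis is omitted for brevity.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centers spaced evenly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge
            fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # peak and falling edge
            fb[i - 1, k] = (right - k) / (right - center)
    return fb

def frame_signal(signal, frame_len, hop):
    # Split into overlapping frames and apply a Hamming window to each.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def fbank_features(signal, sample_rate=16000, n_fft=512, n_filters=26):
    # Fbank: log of the mel filter-bank energies of each frame (assumed settings).
    frames = frame_signal(signal, frame_len=400, hop=160)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(energies + 1e-10)

def dct2(x, n_out):
    # Unnormalized DCT-II along the last axis, keeping the first n_out coefficients.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc_features(signal, n_ceps=13, **kwargs):
    # MFCC: DCT of the Fbank features, which decorrelates the filter outputs.
    return dct2(fbank_features(signal, **kwargs), n_ceps)

if __name__ == "__main__":
    audio = np.random.default_rng(0).standard_normal(16000)  # 1 s of noise as stand-in audio
    print(fbank_features(audio).shape, mfcc_features(audio).shape)
```

Either feature matrix (one row per frame) could then be fed frame by frame to a sequence model such as an LSTM. The abstract does not state the network configuration, so none is assumed here.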

Pages: 679-684

Page count: 6
