Fuzzy Speech Recognition Algorithm Based on Continuous Density Hidden Markov Model and Self Organizing Feature Map

Cited by: 0

Authors:

Zhang, Yanning [1]

Ma, Lei [1]

Li, Yunwei [2]

Affiliations:

[1] Beijing Polytech Univ, Telecommun Engn Inst, Beijing 100176, Peoples R China

[2] Beijing Youth Polit Coll, Deans Off, Beijing, Peoples R China

Source:

INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY | 2025 / Vol. 22 / No. 02

Keywords:

Speech recognition; Wiener filter; Mel-frequency cepstrum coefficient; continuous hidden Markov model; self-organizing feature neural network; FREQUENCY CEPSTRAL COEFFICIENTS

DOI:

10.34028/iajit/22/2/11

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Speech recognition is the process by which a computer receives and understands human speech input and converts it into readable text or instructions. To improve both the denoising and the recognition of fuzzy speech, a fuzzy speech recognition algorithm based on a continuous density hidden Markov model and a self-organizing feature map is proposed. First, the conventional Wiener filtering algorithm is improved with a dynamic noise power spectrum estimation algorithm: spectral entropy is used for endpoint detection of the noisy speech signal, and the noise power spectrum of the silent segments is dynamically updated according to the detection results to obtain a better a priori signal-to-noise ratio. Second, the fuzzy speech is passed through the Wiener filter to remove the noise in the speech signal. Third, the Mel-Frequency Cepstrum Coefficients (MFCC) of the speech signal are extracted as speech features. Finally, the continuous hidden Markov model is combined with the self-organizing feature neural network, both drawn from artificial intelligence algorithms, and speech classification and recognition are realized from the speech features through parameter adjustment, Viterbi decoding, and time adjustment of the voice signal within the same state. In comparative experiments on the LibriSpeech dataset, the baselines were speech recognition algorithms based on convolutional and recurrent neural networks, on residual networks and gated convolutional networks, and on multi-scale Mel domain feature map extraction. The experimental results show that the algorithm has good denoising performance: as the intensity of the added environmental noise increases, the algorithm maintains the Signal-to-Noise Ratio (SNR) of speech signals between 88 dB and 98 dB. The algorithm accurately detects the regions of the signal that contain sound, with high endpoint detection accuracy. The accuracy and recall of the Continuous Density Hidden Markov Model-Self-Organizing Feature Neural Network (CDHMM-SOFM) designed in the algorithm increase with the number of iterations, and both reach a maximum of 0.89. The minimum recognition time of the algorithm is 8.2 seconds, and the highest recognition rate reaches 98.7%; after applying the algorithm, the user's error rate ranges from 0.0031 to 0.0084. These results indicate that the algorithm has good application performance.
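The denoising step described in the abstract (spectral-entropy endpoint detection that drives a noise power spectrum update, followed by a Wiener gain) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed entropy threshold, the smoothing factor `alpha`, the gain floor, and the maximum-likelihood estimate of the a priori SNR are all assumptions made for illustration, and the abstract does not specify any of them. The sketch also assumes that noise-only frames have a flatter spectrum, and therefore a higher spectral entropy, than speech frames.

```python
import math

def spectral_entropy(power):
    """Shannon entropy (in nats) of one frame's normalized power spectrum."""
    total = sum(power)
    if total <= 0.0:
        return 0.0
    probs = [p / total for p in power]
    return -sum(q * math.log(q) for q in probs if q > 0.0)

def wiener_gains(frames, entropy_threshold, alpha=0.9, floor=1e-3):
    """Per-bin Wiener gains for a list of power spectra (hypothetical helper).

    frames: list of equal-length lists of non-negative power values.
    Returns (gains, silent_flags), one entry per frame.
    """
    # Assumption: the first frame contains noise only and seeds the estimate.
    noise = list(frames[0])
    gains, silent_flags = [], []
    for power in frames:
        # High entropy means a flat spectrum, treated here as a silent segment.
        silent = spectral_entropy(power) >= entropy_threshold
        if silent:
            # Recursive smoothing of the noise power spectrum in silent frames.
            noise = [alpha * n + (1.0 - alpha) * p for n, p in zip(noise, power)]
        # Assumed a priori SNR estimate: max(P / N - 1, 0) per frequency bin.
        xi = [max(p / max(n, 1e-12) - 1.0, 0.0) for p, n in zip(power, noise)]
        # Wiener gain xi / (1 + xi), floored to limit musical noise.
        gains.append([max(x / (1.0 + x), floor) for x in xi])
        silent_flags.append(silent)
    return gains, silent_flags

# Toy input: flat (noise-like) frame, peaked (speech-like) frame, flat frame.
toy_frames = [[1.0, 1.0, 1.0, 1.0], [100.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
toy_gains, toy_silent = wiener_gains(toy_frames, entropy_threshold=1.0)
```

In this toy input the flat frames are flagged as silent and update the noise estimate, while the peaked frame keeps a gain near 1 only in its dominant bin. A practical system would compute the power spectra from windowed FFT frames and would likely use an adaptive threshold rather than a fixed one.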

Pages: 18
