Robust Automatic Speech Recognition System for the Recognition of Continuous Kannada Speech Sentences in the Presence of Noise

Times cited: 2
Authors
Mahadevaswamy [1]
Affiliations
[1] Visvesvaraya Technol Univ, Vidyavardhaka Coll Engn, Dept Elect & Commun Engn, Mysuru, Karnataka, India
Keywords
Approximation coefficients; Detail coefficients; Monophones; Tri-phones; Deep neural networks; Discrete wavelet transform; Word recognition; Features
DOI
10.1007/s11277-023-10371-x
Chinese Library Classification
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract
An Automatic Speech Recognition (ASR) system is developed for recognizing continuous and spontaneous Kannada speech sentences in clean and noisy environments. The language models and acoustic models are constructed using the Kaldi toolkit. The speech corpus is collected from native female and male Kannada speakers and is partitioned into a training set and a testing set. The performance of the proposed system is evaluated using Word Error Rate (WER). Feature vectors are generated using wavelet packets combined with Mel filter banks. The proposed hand-crafted features perform better than baseline features such as Perceptual Linear Prediction and Mel Frequency Cepstral Coefficients in terms of WER under both clean and noisy conditions.
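For readers unfamiliar with the evaluation metric named in the abstract, WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below illustrates that standard definition with a word-level edit distance. It is not the scoring code used in the paper (which relies on the Kaldi toolkit), and the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: a hypothetical helper, not the paper's scoring script.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.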
Pages: 2039-2058
Number of pages: 20