Robust Automatic Speech Recognition System for the Recognition of Continuous Kannada Speech Sentences in the Presence of Noise

Cited by: 0
Authors
Affiliations
[1] Visvesvaraya Technological University, Department of Electronics and Communication Engineering, Vidyavardhaka College of Engineering
Source
Wireless Personal Communications | 2023 / Vol. 130
Keywords
Approximation coefficients; Detail coefficients; Monophones; Tri-phones; Deep neural networks
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
An Automatic Speech Recognition system is developed for recognizing continuous and spontaneous Kannada speech sentences in clean and noisy environments. The language models and acoustic models are constructed using the Kaldi toolkit. The speech corpus is developed with native female and male Kannada speakers and is partitioned into a training set and a testing set. The performance of the proposed system is analysed and evaluated using the Word Error Rate (WER) metric. Wavelet packets combined with Mel filter banks are used to generate the feature vectors. The proposed hand-crafted features perform better than baseline features such as Perceptual Linear Prediction and Mel Frequency Cepstral Coefficients in terms of WER under both clean and noisy environmental conditions.
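The WER metric named in the abstract is conventionally defined as the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the sample strings are hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: words are whitespace-separated
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one deletion over four words.
wer = word_error_rate("a b c d", "a x c")
```

Because insertions are counted against the reference length, WER can exceed 1.0 when the hypothesis is much longer than the reference.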
Pages: 2039-2058
Number of pages: 19