A Comparative Study of Khasi Speech Recognition Systems with Recurrent Neural Network-Based Language Model

Cited by: 0
Authors
Deepajothi, S. [1 ]
Rao, Vuda Sreenivasa [2 ]
Ambhika, C. [3 ]
Mandala, Vishwanadham [4 ]
Rao, R. V. V. N. Bheema [5 ]
Kumar, Shailendra [6 ]
Gera, Venkateswara Rao [7 ]
Nagaraju, D. [8 ]
Affiliations
[1] SRM Inst Sci & Technol, Dept Comp Technol, Kattankulathur 603203, Tamil Nadu, India
[2] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Vaddeswaram 522302, Andhra Pradesh, India
[3] RMD Engn Coll, Dept AIML, Rsm Nagar, Kavarapetai, India
[4] Indiana Univ, Bloomington, IN USA
[5] Aditya Coll Engn & Technol, Dept Informat Technol, Surampalem, India
[6] Integral Univ Lucknow, Dept ECE, Lucknow 226026, Uttar Pradesh, India
[7] Kallam Haranadhareddy Inst Technol, Dept ECE, Guntur, India
[8] Sri Venkatesa Perumal Coll Engn & Technol, Dept CSE, Puttur, Andhra Pradesh, India
Keywords
Hidden Markov model; Language model; Perceptual linear prediction; Gaussian mixture model; Acoustic model;
DOI
Not available
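Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code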
0808 ; 0809 ;
Abstract
This paper offers a comparative analysis of Khasi speech recognition systems that use a recurrent neural network-based language model (RNN-LM). Different acoustic models (AMs) are developed and evaluated to find the best-performing configuration. The paper observes that the RNN-LM performs better than the other, traditional models. A continuous speech database was collected from recordings, and the data were processed with WaveSurfer. Moreover, the word error rate (WER) is reduced, in the range of 2.8-3.8% for the major speech data and 2.4-3.5% for the minor speech data. Additionally, two acoustic features are used, and the experimental results show that the Mel frequency cepstral coefficient (MFCC) yields better performance than perceptual linear prediction (PLP).
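The abstract reports results as word error rate (WER) but does not restate how it is computed. As a minimal sketch, assuming the conventional definition (word-level edit distance between reference and hypothesis, divided by the number of reference words), the metric could be computed as follows. The function name `word_error_rate` and the placeholder tokens are illustrative assumptions, not taken from the paper, and the paper's own scoring tool may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes whitespace tokenization; a real scoring tool may normalize
    text differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens (not real Khasi data): one substitution and one
# deletion against a four-word reference.
example_wer = word_error_rate("w1 w2 w3 w4", "w1 wx w3")
```

Under this definition a WER can exceed 1.0 when the hypothesis contains many insertions, which is why it is reported as an error rate and not as an accuracy.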
Pages: 1296-1305
Number of pages: 10
Related papers
50 in total
  • [31] Sign to Speech Convolutional Neural Network-Based Filipino Sign Language Hand Gesture Recognition System
    Jarabese, Mark Benedict D.
    Marzan, Charlie S.
    Boado, Jenelyn Q.
    Lopez, Rushaine Rica Mae F.
    Ofiana, Lady Grace B.
    Pilarca, Kenneth John P.
    2021 INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND INTELLIGENT CONTROLS (ISCSIC 2021), 2021, : 147 - 153
  • [32] An efficient hybrid learning algorithm for neural network-based speech recognition systems on FPGA chip
    Pan, Shing-Tai
    Lan, Min-Lun
    NEURAL COMPUTING & APPLICATIONS, 2014, 24 (7-8): : 1879 - 1885
  • [33] A study of neural network Russian language models for automatic continuous speech recognition systems
    Kipyatkova, I. S.
    Karpov, A. A.
    AUTOMATION AND REMOTE CONTROL, 2017, 78 (05) : 858 - 867
  • [34] A study of neural network Russian language models for automatic continuous speech recognition systems
    I. S. Kipyatkova
    A. A. Karpov
    Automation and Remote Control, 2017, 78 : 858 - 867
  • [35] GAUSSIAN PROCESS LSTM RECURRENT NEURAL NETWORK LANGUAGE MODELS FOR SPEECH RECOGNITION
    Lam, Max W. Y.
    Chen, Xie
    Hu, Shoukang
    Yu, Jianwei
    Liu, Xunying
    Meng, Helen
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 7235 - 7239
  • [36] Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition
    Masumura, Ryo
    Asami, Taichi
    Oba, Takanobu
    Sakauchi, Sumitaka
    Ito, Akinori
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019, E102D (12) : 2557 - 2567
  • [37] Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment
    Deena, Salil
    Hasan, Madina
    Doulaty, Mortaza
    Saz, Oscar
    Hain, Thomas
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (03) : 572 - 582
  • [38] Neural network-based blended ensemble learning for speech emotion recognition
    Yalamanchili, Bhanusree
    Samayamantula, Srinivas Kumar
    Anne, Koteswara Rao
    MULTIDIMENSIONAL SYSTEMS AND SIGNAL PROCESSING, 2022, 33 (04) : 1323 - 1348
  • [39] Neural network-based blended ensemble learning for speech emotion recognition
    Bhanusree Yalamanchili
    Srinivas Kumar Samayamantula
    Koteswara Rao Anne
    Multidimensional Systems and Signal Processing, 2022, 33 : 1323 - 1348
  • [40] Language Model Optimization for a Deep Neural Network Based Speech Recognition System for Serbian
    Pakoci, Edvin
    Popovic, Branislav
    Pekar, Darko
    SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 483 - 492