Implementation and performance evaluation of continuous Hindi speech recognition

Authors
Kumar, Ankit [1]
Dua, Mohit [1]
Choudhary, Arun [2]
Affiliations
[1] Natl Inst Technol, Dept Comp Engn, Kurukshetra, Haryana, India
[2] Vishveshwarya Inst Tech, Dept Comp Engn, Greater Noida, India
Source
2014 INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION SYSTEMS (ICECS) | 2014
Keywords
Automatic Hindi speech recognition; Continuous Hindi speech recognition; MFCC; PLP; GMM;
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology];
Discipline classification codes
0808 ; 0809 ;
Abstract
Speech-to-text recognition is the ability of a machine to recognize human speech and convert it into a text sequence. In this paper, we compare the performance of isolated-word, connected-word, and continuous speech recognition systems with different vocabulary sizes. The Hidden Markov Model Toolkit (HTK) 3.4.1 is used to develop the system. Both Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) are used for feature extraction. The aim of this paper is to build a high-performance speech recognition system for the Hindi language. A Hidden Markov Model (HMM) and a Gaussian Mixture Model (GMM) are used at the back end of the proposed system. The system is trained on 100 Hindi words, with 10 utterances of each word recorded for training the automatic speech recognition (ASR) system. The experimental results show that the overall accuracy of the proposed system with a 100-word dictionary is 95.40% when the combination of MFCC and GMM is used for the ASR system.
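The abstract does not say how the accuracy figure was computed. As a hedged illustration only: HTK's HResults tool conventionally reports percent correct as (N - D - S) / N and word accuracy as (N - D - S - I) / N, where N is the number of reference words and D, S, I are the deletion, substitution, and insertion counts. The sketch below assumes that convention; the function names and the counts in the example are hypothetical and are not taken from the paper.

```python
def percent_correct(n_ref, deletions, substitutions):
    """Percent of reference words recognized correctly: (N - D - S) / N * 100."""
    if n_ref <= 0:
        raise ValueError("n_ref must be a positive word count")
    return 100.0 * (n_ref - deletions - substitutions) / n_ref


def word_accuracy(n_ref, deletions, substitutions, insertions):
    """Word accuracy, which also penalizes insertions: (N - D - S - I) / N * 100."""
    if n_ref <= 0:
        raise ValueError("n_ref must be a positive word count")
    return 100.0 * (n_ref - deletions - substitutions - insertions) / n_ref


# Hypothetical counts for illustration; these are NOT the paper's results.
example_accuracy = word_accuracy(n_ref=200, deletions=2, substitutions=5, insertions=3)
print(f"{example_accuracy:.2f}%")  # prints "95.00%"
```

Because insertions are subtracted as well, word accuracy can be lower than percent correct on continuous speech, so it helps to know which of the two a reported figure refers to.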
Pages: 5