Combination of improved Katz and mutual information for speech recognition based on Lattice

Cited by: 1
Authors
Zhang Lei [1]
Lu Dong [1]
Xiang Xue-zhi [1]
Affiliations
[1] Harbin Engn Univ, Dept Commun & Informat, Harbin, Heilongjiang Pr, Peoples R China
Source
2010 8TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA) | 2010
Keywords
Improved Katz; Mutual Information; Lattice; Speech Recognition
DOI
10.1109/WCICA.2010.5554331
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In Chinese language models, the occurrence count of an n-gram alone cannot account for its reliability. Some n-grams carry strong meaning in Chinese even though their counts in the training data are low. Combining the count with mutual information gives a better description of whether the syllables in an n-gram are highly associated. Furthermore, most smoothing approaches adopt the idea of discounting: for less reliable n-grams, the probabilities estimated from the training data may be over-estimated, so they should be discounted. In this paper, not only are the over-estimated n-grams discounted, but the most reliable n-grams, which may be under-estimated during training, are also enhanced. Because all these modifications are based on Katz smoothing, the most commonly used method in speech recognition, the proposed approach is named improved Katz. The experimental results show that improved Katz performs better than Katz alone.
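The abstract does not give the exact mutual-information formula used in the paper. As a rough illustration only, the standard pointwise mutual information of an adjacent pair, PMI(x, y) = log(P(x, y) / (P(x) P(y))), can be estimated from raw counts as sketched below. The function name `pmi_table` and the maximum-likelihood estimates are assumptions made for this sketch, not the authors' method.

```python
import math
from collections import Counter

def pmi_table(tokens):
    """Return {(x, y): PMI} for adjacent pairs in tokens (natural log).

    Hypothetical helper: probabilities are plain relative frequencies,
    which is an assumption of this sketch, not taken from the paper.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    table = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi          # joint probability of the pair
        p_x = unigrams[x] / n_uni    # marginal probability of x
        p_y = unigrams[y] / n_uni    # marginal probability of y
        table[(x, y)] = math.log(p_xy / (p_x * p_y))
    return table
```

A pair that occurs rarely but almost always together receives a high score under this measure, which is the kind of low-count yet strongly associated n-gram the abstract describes.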
Pages: 6379-6382
Number of pages: 4