Gaussian mixture language models for speech recognition

Cited by: 0
Authors
Afify, Mohamed [1]
Siohan, Olivier [1]
Sarikaya, Ruhi [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, 1101 Old Kitchawan Rd, Yorktown Hts, NY 10598 USA
Source
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007
Keywords
language model; N-gram; Gaussian mixture model; continuous space;
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
We propose a Gaussian mixture language model (GMLM) for speech recognition. Two potential benefits of this model are the smoothing of unseen events and ease of adaptation. We show how the model can be used alone or in conjunction with a conventional N-gram model to calculate word probabilities. An interesting feature of the proposed technique is that many methods developed for acoustic models can be easily ported to the GMLM. We developed two implementations of the proposed model for large-vocabulary Arabic speech recognition, with results comparable to those of a conventional N-gram model.
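The abstract only outlines the idea, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes word histories are already mapped to continuous vectors, fits one small Gaussian mixture per word with scikit-learn's GaussianMixture to model p(h | w), and converts this into p(w | h) via Bayes' rule; the toy vocabulary, the random training data, and the helper names gmlm_probs and interpolate are all assumptions made for this sketch.

# Illustrative sketch of a Gaussian mixture language model (not the paper's system).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy setup: a tiny vocabulary and random continuous "history" vectors per word.
vocab = ["the", "cat", "sat"]
dim = 4
train = {w: rng.normal(loc=i, scale=1.0, size=(200, dim)) for i, w in enumerate(vocab)}

# Fit a small GMM per word over its history vectors: p(h | w).
gmms = {w: GaussianMixture(n_components=2, covariance_type="diag", random_state=0).fit(X)
        for w, X in train.items()}

# Unigram priors p(w) from the toy counts.
total = sum(len(X) for X in train.values())
prior = {w: len(X) / total for w, X in train.items()}

def gmlm_probs(history_vec):
    """p(w | h) proportional to p(h | w) p(w), normalized over the vocabulary."""
    h = history_vec.reshape(1, -1)
    scores = np.array([gmms[w].score_samples(h)[0] + np.log(prior[w]) for w in vocab])
    scores -= scores.max()                      # numerical stability
    probs = np.exp(scores)
    return dict(zip(vocab, probs / probs.sum()))

def interpolate(gmlm, ngram, lam=0.5):
    """Simple linear mixture with a conventional N-gram estimate, as the abstract suggests."""
    return {w: lam * gmlm[w] + (1 - lam) * ngram.get(w, 0.0) for w in gmlm}

h = rng.normal(loc=1.0, size=dim)               # a query history vector
print(gmlm_probs(h))

The linear interpolation shown here is just one straightforward way to combine the GMLM with an N-gram model; the paper may use a different combination scheme.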
Pages: 29+
Number of pages: 2
Related references
9 references in total
[1] Bellegarda, J. R. Large vocabulary speech recognition with multispan statistical language models. IEEE Transactions on Speech and Audio Processing, 2000, 8(1): 76-84.
[2] Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C. A neural probabilistic language model. Journal of Machine Learning Research, 2003, 3(6): 1137-1155.
[3] Duda, R. Pattern Classification. 2000.
[4] Gao, Y. Proc. ICASSP 2006, Toulouse, France, 2006.
[5] Golub, G. H. Matrix Computations. 1996.
[6] Jelinek, F. Statistical Methods for Speech Recognition. 1998.
[7] Leggetter, C. J., Woodland, P. C. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language, 1995, 9(2): 171-185.
[8] Schwenk, H. IEEE Workshop on Spontaneous Speech Processing and Recognition, 2003.
[9] Stolcke, A. Proc. ICSLP 2002, Denver, Colorado, 2002.