Gaussian Mixture Optimization for HMM based on Efficient Cross-validation

Citations: 0
Authors
Shinozaki, Takahiro [1]
Kawahara, Tatsuya [1]
Affiliation
[1] Kyoto Univ, Acad Ctr Comp & Media Studies, Kyoto, Japan
Source
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007
Keywords
speech recognition; HMM; Gaussian mixture; cross-validation; sufficient statistics
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A Gaussian mixture optimization method is explored that uses cross-validation likelihood as the objective function instead of the conventional training set likelihood. The optimization reduces the number of mixture components step by step, selecting and merging one pair of Gaussians at a time based on the objective function, so as to remove redundant components and improve the generality of the model. Cross-validation likelihood is more appropriate for avoiding over-fitting than the conventional likelihood and can be computed efficiently using sufficient statistics. It results in better Gaussian pair selection and provides a termination criterion that does not rely on empirical thresholds. Large-vocabulary speech recognition experiments on oral presentations show that the cross-validation method gives a smaller word error rate, with an automatically determined model size, than a baseline training procedure that does not perform the optimization.
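The merging loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one-dimensional Gaussians, hard assignment of frames to components, and it ignores mixture weights. The helper names (`stats`, `merge`, `cv_loglik`, `greedy_merge`) and the variance floor are hypothetical. Each component keeps per-fold sufficient statistics (count, sum, sum of squares); the held-out score of a fold is evaluated with parameters estimated from the remaining folds, and merging stops as soon as no pair improves the cross-validation score.

```python
import math
from itertools import combinations

def stats(xs):
    """Sufficient statistics (count, sum, sum of squares) of a list of scalars."""
    return (float(len(xs)), float(sum(xs)), float(sum(x * x for x in xs)))

def merge(a, b):
    """Merge two components by adding their per-fold statistics."""
    return [tuple(p + q for p, q in zip(fa, fb)) for fa, fb in zip(a, b)]

def cv_loglik(comp, var_floor=1e-3):
    """Cross-validation log-likelihood of one component (assumed 1-D Gaussian).

    comp is a list with one (count, sum, sum_sq) tuple per fold. For each fold,
    mean and variance come from the other folds only, and the held-out fold is
    scored using nothing but its own statistics.
    """
    tot = [sum(f[i] for f in comp) for i in range(3)]
    score = 0.0
    for n, s1, s2 in comp:
        tn, t1, t2 = tot[0] - n, tot[1] - s1, tot[2] - s2  # training statistics
        if tn <= 0:
            return -math.inf
        mu = t1 / tn
        var = max(t2 / tn - mu * mu, var_floor)  # floor is an assumption
        sq = s2 - 2.0 * mu * s1 + n * mu * mu    # sum of (x - mu)^2 over held-out data
        score += -0.5 * n * math.log(2.0 * math.pi * var) - 0.5 * sq / var
    return score

def greedy_merge(comps):
    """Merge the best pair repeatedly; stop when no merge raises the CV score."""
    comps = list(comps)
    while len(comps) > 1:
        best_gain, best_pair = 0.0, None
        for i, j in combinations(range(len(comps)), 2):
            gain = (cv_loglik(merge(comps[i], comps[j]))
                    - cv_loglik(comps[i]) - cv_loglik(comps[j]))
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:  # no threshold needed: no pair gives a positive gain
            break
        i, j = best_pair
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps

# Toy data: A and B cover the same region, C is far away (two folds each).
A = [stats([0, 2]), stats([1, 3])]
B = [stats([1, 3]), stats([0, 2])]
C = [stats([10, 12]), stats([11, 13])]
print(len(greedy_merge([A, B, C])))
```

In this toy setup the redundant components A and B are merged while C is kept separate, so the final model size is chosen by the cross-validation score alone. Because every quantity is a sum over folds, evaluating a candidate merge needs only additions of stored statistics rather than another pass over the data.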
Pages: 653-656
Page count: 4