Integrating Codebook and Utterance Information in Cepstral Statistics Normalization Techniques for Robust Speech Recognition

Cited by: 0
Authors
He, Guan-min
Hung, Jeih-weih
Source
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009
Keywords
speech recognition; noise-robust features; codebooks
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cepstral statistics normalization techniques have been shown to be very successful at improving the noise robustness of speech features. This paper proposes a hybrid-based scheme to achieve a more accurate estimate of the statistical information of features in these techniques. By properly integrating codebook and utterance knowledge, the resulting hybrid-based approach significantly outperforms conventional utterance-based, segment-based and codebook-based approaches in noisy environments. For the Aurora-2 clean-condition training task, the proposed hybrid codebook/segment-based histogram equalization (CS-HEQ) achieves an average recognition accuracy of 90.66%, which is better than utterance-based HEQ (87.62%), segment-based HEQ (85.92%) and codebook-based HEQ (85.29%). Furthermore, the high-performance CS-HEQ can be implemented with a short delay and can thus be applied in real-time online systems.
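For readers unfamiliar with the baseline, the following is a minimal sketch of conventional utterance-based HEQ, not the proposed CS-HEQ. It maps each cepstral dimension of one utterance onto a reference distribution through the empirical CDF of that utterance. The function name `heq_utterance`, the standard normal reference distribution, and the `(rank + 0.5) / N` CDF estimator are assumptions made for illustration; the abstract does not specify any of them.

```python
from statistics import NormalDist


def heq_utterance(features):
    """Equalize each feature dimension of one utterance to N(0, 1).

    features: list of frames, each a list of cepstral coefficients.
    Returns a new list of frames with the same shape.
    """
    n_frames = len(features)
    n_dims = len(features[0])
    reference = NormalDist()  # assumed reference distribution: N(0, 1)
    out = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        # Order the frame indices by the value of dimension d.
        order = sorted(range(n_frames), key=lambda t: features[t][d])
        for rank, t in enumerate(order):
            # Empirical CDF estimate, kept strictly inside (0, 1).
            cdf = (rank + 0.5) / n_frames
            # Map through the inverse CDF of the reference distribution.
            out[t][d] = reference.inv_cdf(cdf)
    return out


if __name__ == "__main__":
    toy = [[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]]
    for frame in heq_utterance(toy):
        print(frame)
```

Because the empirical CDF here is estimated from the whole utterance, this sketch needs the full utterance before it can emit any output. That delay is the limitation the abstract contrasts with the short-delay CS-HEQ.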
Pages: 1231-1234
Page count: 4