Cepstral shape normalization (CSN) for robust speech recognition

Cited by: 0
Authors
Du, Jun [1]
Wang, Ren-Hua [1]
Affiliation
[1] Univ Sci & Technol China, Hefei 230027, Peoples R China
Source
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008
Keywords
robust speech recognition; shape normalization;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this paper, we propose a new feature normalization approach for robust speech recognition. We find that the shape of speech feature distributions in noisy environments differs from that in the clean condition. Cepstral shape normalization (CSN), which normalizes the shape of feature distributions, is therefore performed by exploiting an exponential factor. This method has been proven effective in noisy environments, especially at low SNRs. Experimental results show that the proposed method yields relative word error rate reductions of 38% and 25% on the Aurora2 and Aurora3 databases, respectively, compared with conventional mean and variance normalization (MVN). It is also shown that CSN consistently outperforms other traditional methods, such as histogram equalization (HEQ) and higher order cepstral moment normalization (HOCMN).
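The abstract does not give the CSN formula, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It applies standard MVN to one cepstral coefficient track, then a sign-preserving power with a hypothetical exponent `alpha` standing in for the "exponential factor", and re-normalizes. The function names, the choice of transform, and the use of kurtosis as a shape descriptor are all illustrative assumptions; how the paper actually chooses its factor is not stated in this record.

```python
import math

def mvn(values):
    """Mean and variance normalization of one cepstral coefficient track."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var) if var > 0 else 1.0  # guard against a constant track
    return [(v - mean) / std for v in values]

def shape_normalize(values, alpha):
    """Hypothetical shape step (assumption, not the paper's formula):
    MVN, then a sign-preserving power with exponent alpha, then MVN again."""
    z = mvn(values)
    warped = [math.copysign(abs(v) ** alpha, v) for v in z]
    return mvn(warped)

def kurtosis(values):
    """Fourth standardized moment, a simple descriptor of distribution shape."""
    z = mvn(values)
    return sum(v ** 4 for v in z) / len(z)

# Toy track with heavy tails; values are made up for illustration.
track = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]
normalized = shape_normalize(track, alpha=0.5)
print(kurtosis(track), kurtosis(normalized))
```

In this sketch an exponent below 1 compresses the tails and lowers the kurtosis, while `alpha = 1` reduces to plain MVN, which makes the relationship to the MVN baseline named in the abstract easy to see.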
Pages: 4389-4392
Number of pages: 4