A minimum-mean-square-error noise reduction algorithm on Mel-frequency cepstra for robust speech recognition

Cited by: 0
Authors
Yu, Dong [1 ]
Deng, Li [1 ]
Droppo, Jasha [1 ]
Wu, Jian [1 ]
Gong, Yifan [1 ]
Acero, Alex [1 ]
Affiliation
[1] Microsoft Corp, Redmond, WA 98052 USA
Source
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008
Keywords
MMSE estimator; MFCC; noise reduction; robust ASR; speech feature enhancement;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We present a non-linear feature-domain noise reduction algorithm based on the minimum mean square error (MMSE) criterion on Mel-frequency cepstra (MFCC) for environment-robust speech recognition. Unlike the MMSE enhancement of the log spectral amplitude proposed by Ephraim and Malah (E&M) [7], the new algorithm develops a suppression rule that applies to the power spectral magnitude of the filter-bank outputs and to the MFCC directly, making it demonstrably more effective for noise-robust speech recognition. The noise variance in the new algorithm contains a significant term, missing in the E&M algorithm, that results from the instantaneous phase asynchrony between the clean speech and the mixing noise. Speech recognition experiments on the standard Aurora-3 task demonstrate a word error rate reduction of 48% against the ICSLP02 baseline, 26% against the cepstral mean normalization baseline, and 13% against the conventional E&M log-MMSE noise suppressor. The new algorithm is also much more efficient than the E&M noise suppressor, since the number of channels in the Mel-frequency filter bank (23 in our case) is much smaller than the number of bins in the FFT domain (256). The results also show that our algorithm performs slightly better than the ETSI AFE on the well-matched and mid-mismatched settings.
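The efficiency argument in the abstract (a gain computed for 23 Mel channels rather than 256 FFT bins) can be illustrated with a minimal Python sketch. Everything beyond the two counts is an assumption made for illustration: the 8 kHz sampling rate, the triangular filter shape, the 13 cepstral coefficients, the gain floor, and the function names are all hypothetical. In particular, the gain below is a plain power spectral subtraction with a floor, swapped in as a stand-in; it is not the MMSE suppression rule of the paper, which the abstract does not spell out.

```python
import numpy as np

N_MELS, N_BINS = 23, 256  # channel and bin counts quoted in the abstract


def mel_filterbank(n_mels=N_MELS, n_bins=N_BINS, sample_rate=8000.0):
    """Triangular Mel filters over n_bins linear-frequency bins (assumed layout)."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    nyquist = sample_rate / 2.0
    # n_mels + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(nyquist), n_mels + 2))
    bin_hz = np.linspace(0.0, nyquist, n_bins)
    fb = np.zeros((n_mels, n_bins))
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def dct_matrix(n_ceps, n_mels):
    """Unnormalized DCT-II matrix mapping log filter-bank energies to cepstra."""
    n = np.arange(n_mels)
    k = np.arange(n_ceps)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))


def suppress_and_cepstra(power_spec, noise_power, fb, n_ceps=13, floor=0.05):
    """Apply a per-channel gain in the Mel domain, then compute cepstra.

    The gain is simple power spectral subtraction with a floor (a stand-in,
    NOT the paper's MMSE rule). It is evaluated once per Mel channel, so only
    fb.shape[0] gains are needed instead of one per FFT bin.
    """
    noisy_mel = fb @ power_spec   # noisy filter-bank outputs, one per channel
    noise_mel = fb @ noise_power  # noise estimate mapped to the same channels
    gain = np.maximum(1.0 - noise_mel / noisy_mel, floor)
    clean_mel = gain * noisy_mel
    ceps = dct_matrix(n_ceps, fb.shape[0]) @ np.log(clean_mel)
    return ceps, gain


# Synthetic single-frame example: flat noise plus random "speech" power.
rng = np.random.default_rng(0)
noise_power = np.full(N_BINS, 0.5)
power_spec = rng.uniform(1.0, 4.0, N_BINS) + noise_power
fb = mel_filterbank()
ceps, gain = suppress_and_cepstra(power_spec, noise_power, fb)
print(fb.shape, gain.shape, ceps.shape)
```

In this sketch the suppression step touches 23 values per frame, whereas a bin-level suppressor would evaluate its gain for each of the 256 FFT bins before the filter bank is applied.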
Pages: 4041 - 4044
Page count: 4
Related papers
29 in total
  • [1] IMPROVED CEPSTRA MINIMUM-MEAN-SQUARE-ERROR NOISE REDUCTION ALGORITHM FOR ROBUST SPEECH RECOGNITION
    Li, Jinyu
    Huang, Yan
    Gong, Yifan
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4865 - 4869
  • [2] IMPROVEMENTS ON MEL-FREQUENCY CEPSTRUM MINIMUM-MEAN-SQUARE-ERROR NOISE SUPPRESSOR FOR ROBUST SPEECH RECOGNITION
    Yu, Dong
    Deng, Li
    Wu, Jian
    Gong, Yifan
    Acero, Alex
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 69 - 72
  • [3] A THEORETICALLY CONSISTENT METHOD FOR MINIMUM MEAN-SQUARE ERROR ESTIMATION OF MEL-FREQUENCY CEPSTRAL FEATURES
    Jensen, Jesper
    Tan, Zheng-Hua
    2014 4TH IEEE INTERNATIONAL CONFERENCE ON NETWORK INFRASTRUCTURE AND DIGITAL CONTENT (IEEE IC-NIDC), 2014, : 368 - 373
  • [4] Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor
    Yu, Dong
    Deng, Li
    Droppo, Jasha
    Wu, Jian
    Gong, Yifan
    Acero, Alex
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (05): : 1061 - 1070
  • [5] Minimum-mean-square-error filters for detecting a noisy target in background noise
    Javidi, B
    Parchekani, F
    Zhang, GS
    APPLIED OPTICS, 1996, 35 (35): : 6964 - 6975
  • [6] Minimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features-A Theoretically Consistent Approach
    Jensen, Jesper
    Tan, Zheng-Hua
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (01) : 186 - 197
  • [7] Minimum-mean-square-error filters for detecting a noisy target in background noise
    Javidi, Bahram
    Parchekani, Farokh
    Zhang, Guanshen
    1996, Optical Society of America (35):
  • [8] Speaker Recognition Using Mel-Frequency Cepstrum Coefficients and Sum Square Error
    Charisma, Atik
    Hidayat, M. Reza
    Zainal, Yuda Bakti
    2017 3RD INTERNATIONAL CONFERENCE ON WIRELESS AND TELEMATICS (ICWT), 2017, : 160 - 163
  • [9] Applied mel-frequency discrete wavelet coefficients and parallel model compensation for noise-robust speech recognition
    Tufekci, Zekeriya
    Gowdy, John N.
    Gurbuz, Sabri
    Patterson, Eric
    SPEECH COMMUNICATION, 2006, 48 (10) : 1294 - 1307