Dereverberation and denoising based on generalized spectral subtraction by multi-channel LMS algorithm using a small-scale microphone array

Cited by: 15
Authors
Wang, Longbiao [1]
Odani, Kyohei [1]
Kai, Atsuhiko [1]
Affiliation
[1] Shizuoka Univ, Hamamatsu, Shizuoka 4328561, Japan
Keywords
hands-free speech recognition; blind dereverberation; multi-channel least mean squares; GSS; missing feature theory;
DOI
10.1186/1687-6180-2012-12
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
A blind dereverberation method based on power spectral subtraction (SS) using a multi-channel least mean squares algorithm was previously proposed to suppress reverberation in speech without additive noise. Isolated word speech recognition experiments showed that this method achieved significant improvements over conventional cepstral mean normalization (CMN) in a reverberant environment. In this paper, we propose a blind dereverberation method based on generalized spectral subtraction (GSS), which has been shown to be effective for noise reduction, instead of power SS. Furthermore, we extend the missing feature theory (MFT), which was initially proposed to enhance robustness to additive noise, to dereverberation. A one-stage dereverberation and denoising method based on GSS is presented to simultaneously suppress both additive noise and nonstationary multiplicative noise (reverberation). The proposed dereverberation method based on GSS with MFT is evaluated on a large vocabulary continuous speech recognition task. When additive noise is absent, the dereverberation method based on GSS with MFT using only 2 microphones achieves relative word error reduction rates of 11.4% and 32.6% compared to the dereverberation method based on power SS and the conventional CMN, respectively. For reverberant and noisy speech, the dereverberation and denoising method based on GSS achieves a relative word error reduction rate of 12.8% compared to conventional CMN combined with a GSS-based additive noise reduction method. We also analyze the factors that affect compensation parameter estimation for the SS-based dereverberation method, such as the number of channels (the number of microphones), the length of reverberation to be suppressed, and the length of the utterance used for parameter estimation. The experimental results show that the SS-based method is robust in a variety of reverberant environments for both isolated and continuous speech recognition and under various parameter estimation conditions.
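The difference between power SS and GSS mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used form of generalized spectral subtraction, in which magnitudes are raised to an exponent 2n before subtraction and a spectral floor prevents negative values. The function name, the parameters `n`, `alpha`, and `beta`, and their default values are illustrative assumptions, not values taken from the paper, and the estimate of the interfering spectrum (late reverberation or noise) is assumed to be supplied by some other step such as the multi-channel LMS estimation.

```python
import numpy as np


def generalized_spectral_subtraction(observed_mag, interference_mag,
                                     n=0.5, alpha=1.0, beta=0.1):
    """Sketch of GSS on magnitude spectra (hypothetical helper).

    observed_mag     : magnitude spectrum of the observed speech
    interference_mag : estimated magnitude spectrum of the component to remove
    n                : exponent parameter; n = 1 gives power SS,
                       n = 0.5 gives magnitude SS (assumed convention)
    alpha            : over-subtraction factor (assumed)
    beta             : spectral floor factor (assumed)
    """
    obs = np.asarray(observed_mag, dtype=float) ** (2.0 * n)
    intf = np.asarray(interference_mag, dtype=float) ** (2.0 * n)
    # Subtract in the exponent domain, then floor at a fraction of the
    # observed spectrum so the result never becomes negative.
    cleaned = np.maximum(obs - alpha * intf, beta * obs)
    # Map back to the magnitude domain.
    return cleaned ** (1.0 / (2.0 * n))
```

Calling the function with `n=1.0` reduces it to power SS under these assumptions, which makes it easy to compare the two variants on the same spectra.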
Pages: 11