A novel fast nonstationary noise tracking approach based on MMSE spectral power estimator

Cited by: 17
Authors
Zhang, Qiquan [1]
Wang, Mingjiang [1]
Lu, Yun [1]
Zhang, Lu [1]
Idrees, Muhammad [1]
Affiliations
[1] Harbin Inst Technol, Sch Elect & Informat Engn, Shenzhen 518055, Peoples R China
Keywords
Acoustic noise; Noise PSD estimation; Noise suppression; Speech enhancement; MMSE; Speech presence uncertainty; SPEECH ENHANCEMENT; ESTIMATION ALGORITHM; LOW-COMPLEXITY; AMPLITUDE; COEFFICIENTS;
DOI
10.1016/j.dsp.2019.01.019
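Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code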
0808 ; 0809 ;
Abstract
Estimating the noise power spectral density (PSD) from the corrupted speech signal is an essential component of speech enhancement algorithms. In this paper, a novel noise PSD estimation algorithm based on minimum mean-square error (MMSE) is proposed. The noise PSD estimate is obtained by recursively smoothing the MMSE estimate of the current noise spectral power. For the noise spectral power estimation, a spectral weighting function is derived, which depends on the a priori signal-to-noise ratio (SNR). Since the speech spectral power is highly important for the a priori SNR estimate, this paper proposes an MMSE spectral power estimator incorporating speech presence uncertainty (SPU) to estimate the speech spectral power and thereby improve the a priori SNR estimate. Moreover, a bias correction factor is derived to compensate for the bias of the speech spectral power estimate. Then, the estimated speech spectral power is used in the "decision-directed" (DD) estimator of the a priori SNR to achieve fast noise tracking. Compared with three state-of-the-art approaches, i.e., minimum statistics (MS), an MMSE-based approach, and a speech presence probability (SPP)-based approach, the experimental results show that the proposed algorithm exhibits better noise tracking capability under various nonstationary noise environments and SNR conditions. When employed in a speech enhancement system, it yields improved speech enhancement performance in terms of segmental SNR improvement (SSNR+) and perceptual evaluation of speech quality (PESQ). (C) 2019 Elsevier Inc. All rights reserved.
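The recursion described in the abstract (a decision-directed a priori SNR, an MMSE estimate of the noise spectral power, then recursive smoothing) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not state their weighting function, SPU model, or bias correction factor, so the sketch assumes the standard complex-Gaussian MMSE power expressions and omits SPU and bias correction entirely. The function name `track_noise_psd` and the parameters `alpha_dd`, `beta`, and `xi_min` are hypothetical and chosen only for illustration.

```python
def track_noise_psd(periodograms, init_noise_psd,
                    alpha_dd=0.98, beta=0.8, xi_min=1e-3):
    """Track the noise PSD per frequency bin across frames.

    periodograms:   list of frames, each a list of |Y(k)|^2 values.
    init_noise_psd: list holding the initial noise PSD for each bin.
    Returns one list of noise PSD estimates per input frame.

    Assumption (not from the abstract): speech and noise DFT coefficients
    are modelled as complex Gaussian; SPU and bias correction are omitted.
    """
    noise_psd = list(init_noise_psd)
    speech_power = [0.0] * len(noise_psd)  # previous-frame speech power estimate
    history = []
    for frame in periodograms:
        updated = []
        for k, y2 in enumerate(frame):
            sigma_n = noise_psd[k]
            gamma = y2 / sigma_n  # a posteriori SNR
            # Decision-directed a priori SNR, floored at xi_min.
            xi = (alpha_dd * speech_power[k] / sigma_n
                  + (1.0 - alpha_dd) * max(gamma - 1.0, 0.0))
            xi = max(xi, xi_min)
            gain = xi / (1.0 + xi)  # Wiener gain
            # MMSE estimate of the noise spectral power E{|N|^2 | Y}.
            noise_power = (1.0 - gain) ** 2 * y2 + gain * sigma_n
            # MMSE estimate of the speech spectral power E{|S|^2 | Y},
            # fed back into the decision-directed estimator next frame.
            speech_power[k] = gain ** 2 * y2 + gain * sigma_n
            # Recursive (first-order) smoothing of the noise PSD.
            updated.append(beta * sigma_n + (1.0 - beta) * noise_power)
        noise_psd = updated
        history.append(updated)
    return history


if __name__ == "__main__":
    # One bin: noise-like input at power 1.0, then a jump to 4.0.
    frames = [[1.0]] * 5 + [[4.0]] * 3
    for psd in track_noise_psd(frames, [1.0]):
        print(psd[0])
```

In this sketch a larger `beta` gives a smoother but slower estimate, while the `xi_min` floor keeps the Wiener gain strictly positive so that the update never degenerates when the decision-directed estimate falls to zero.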
Pages: 41-52
Number of pages: 12
Related papers
41 in total
[1]  
[Anonymous], 2009, INTERSPEECH
[2]   Elimination of the Musical Noise Phenomenon with the Ephraim and Malah Noise Suppressor [J].
Cappe, Olivier .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (02) :345-349
[3]   Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech [J].
Cauchi, Benjamin ;
Kodrasi, Ina ;
Rehr, Robert ;
Gerlach, Stephan ;
Jukic, Ante ;
Gerkmann, Timo ;
Doclo, Simon ;
Goetze, Stefan .
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2015,
[4]   Voice activity detection based on multiple statistical models [J].
Chang, Joon-Hyuk ;
Kim, Nam Soo ;
Mitra, Sanjit K. .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2006, 54 (06) :1965-1976
[5]  
Chinaev A, 2017, INT CONF ACOUST SPEE, P4980, DOI 10.1109/ICASSP.2017.7953104
[6]   Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging [J].
Cohen, I .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (05) :466-475
[7]   Noise estimation by minima controlled recursive averaging for robust speech enhancement [J].
Cohen, I ;
Berdugo, B .
IEEE SIGNAL PROCESSING LETTERS, 2002, 9 (01) :12-15
[8]   Speech enhancement for non-stationary noise environments [J].
Cohen, I ;
Berdugo, B .
SIGNAL PROCESSING, 2001, 81 (11) :2403-2418
[9]   Phase-Aware Single-Channel Speech Enhancement With Modulation-Domain Kalman Filtering [J].
Dionelis, Nikolaos ;
Brookes, Mike .
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (05) :937-950
[10]  
Doblinger G., 1995, P EUR, P1513