Speech enhancement based on Bayesian decision and spectral amplitude estimation

Cited by: 0

Authors:

Feng Deng

Chang-Chun Bao

Affiliation:

[1] Beijing University of Technology, Speech and Audio Signal Processing Lab, School of Electronic Information and Control Engineering

Source:

EURASIP Journal on Audio, Speech, and Music Processing, Vol. 2015

Keywords:

Speech enhancement; Bayesian decision; Spectral amplitude estimation; Combined Bayesian risk function; General weighted cost function;

DOI:

Not available

Abstract:

In this paper, we propose a single-channel speech enhancement method based on Bayesian decision and spectral amplitude estimation. The method comprises a speech detection module and a spectral amplitude estimation module, and the two modules are strongly coupled. First, the optimal spectral amplitude estimators under the speech-presence and speech-absence decisions are each obtained by minimizing a combined Bayesian risk function. Second, using these estimators, the optimal speech detector is obtained by further minimizing the same combined Bayesian risk function. Finally, according to the detector's result, the optimal decision is made and the corresponding spectral amplitude estimator is selected to enhance the noisy speech. To account for both detection and estimation errors, we propose a combined cost function that incorporates two general weighted distortion measures on the spectral amplitudes, one for speech presence and one for speech absence. The cost parameters in this function balance the speech distortion caused by missed detections against the residual noise caused by false alarms. We also propose adaptive methods for computing the two orders that appear in the cost function: the perceptual weighting order p and the spectral amplitude order β. Objective and subjective test results indicate that the proposed method achieves a larger segmental signal-to-noise ratio (SNR) improvement, a lower log-spectral distortion, and better speech quality than the reference methods.
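The abstract reports segmental SNR improvement as one of its objective measures. As a rough illustration of what that measure computes, the sketch below uses a common textbook form of segmental SNR: the per-frame SNR in dB, clamped to a fixed range and averaged over frames. The function name `segmental_snr`, the frame length, the non-overlapping framing, and the clamping limits of -10 dB and 35 dB are assumptions made for this example; the record does not state the settings used in the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB, with each frame clamped to [lo_db, hi_db].

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    n = min(len(clean), len(enhanced))
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, n - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal_energy == 0.0:
            continue  # skip frames with no clean-speech energy
        if error_energy == 0.0:
            snr_db = hi_db  # perfect reconstruction: use the upper limit
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        frame_snrs.append(max(lo_db, min(hi_db, snr_db)))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    clean = [1.0] * 8
    enhanced = [0.9] * 8
    print(round(segmental_snr(clean, enhanced, frame_len=4), 3))
```

Under these assumptions, a segmental SNR improvement would be the difference between `segmental_snr(clean, enhanced)` and `segmental_snr(clean, noisy)` for the same utterance.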
