Speech enhancement using Bayesian estimators of the perceptually-motivated short-time spectral amplitude (STSA) with Chi speech priors

Cited by: 7

Authors:

Trawicki, Marek B. [1]

Johnson, Michael T. [1]

Affiliations:

[1] Marquette Univ, Dept Elect & Comp Engn, Speech & Signal Proc Lab, Milwaukee, WI 53201 USA

Source:

SPEECH COMMUNICATION | 2014 / Vol. 57

Keywords:

Speech enhancement; Probability; Amplitude estimation; Phase estimation; Parameter estimation;

DOI:

10.1016/j.specom.2013.09.009

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In this paper, the authors propose new perceptually-motivated Weighted Euclidean (WE) and Weighted Cosh (WCOSH) estimators that combine more appropriate Chi statistical models for the speech prior with Gaussian statistical models for the noise likelihood. Whereas the perceptually-motivated WE and WCOSH cost functions emphasize spectral valleys rather than spectral peaks (formants) and indirectly account for auditory masking effects, the incorporation of the Chi distribution statistical models demonstrates distinct improvement over the Rayleigh statistical models for the speech prior. The estimators incorporate both weighting law and shape parameters on the cost functions and distributions. Performance is evaluated in terms of the Segmental Signal-to-Noise Ratio (SSNR), Perceptual Evaluation of Speech Quality (PESQ), and Signal-to-Noise Ratio (SNR) Loss objective quality measures to determine the amount of noise reduction along with the improvement in overall speech quality and speech intelligibility. Based on experimental results across three different input SNRs and eight unique noises along with various weighting law and shape parameters, the two general, less complicated closed-form WE and WCOSH estimators with Chi speech priors provide significant gains in noise reduction and noticeable gains in overall speech quality and speech intelligibility over the baseline WE and WCOSH estimators with the standard Rayleigh speech priors. Overall, the goal of the work is to capitalize on the mutual benefits of the WE and WCOSH cost functions and Chi distributions for the speech prior to improve speech enhancement. (C) 2013 Elsevier B.V. All rights reserved.


Pages: 101-113

Page count: 13

17 references in total

[1] Speech spectral amplitude estimators using optimally shaped Gamma and Chi priors
Andrianakis, I.
White, P. R.
[J]. SPEECH COMMUNICATION, 2009, 51 (01) : 1 - 14
[2] [Anonymous], 1969, IEEE T ACOUST SPEECH, VAU17, P225
[3] [Anonymous], SUBJ TEST METH EV SP
[4] [Anonymous], 1993, Continuous Univariate Distributions, DOI 10.1016/0167-9473(96)90015-8
[5] [Anonymous], 2007, Speech Enhancement: Theory and Practice
[6] [Anonymous], 2000, Tables of Integrals
[7] Breithaupt C., 2008, INT C AC SPEECH SIGN
[8] Papamichalis P. E., 1987, PRACTICAL APPROACHES
[9] SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR LOG-SPECTRAL AMPLITUDE ESTIMATOR
EPHRAIM, Y
MALAH, D
[J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, 33 (02) : 443 - 445
[10] SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR SHORT-TIME SPECTRAL AMPLITUDE ESTIMATOR
EPHRAIM, Y
MALAH, D
[J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1984, 32 (06) : 1109 - 1121
