An adaptive a priori SNR estimator for perceptual speech enhancement

Cited by: 6

Authors:

Nahma, Lara [1]

Yong, Pei Chee [2]

Dam, Hai Huyen [1]

Nordholm, Sven [1]

Affiliations:

[1] Curtin Univ, Dept Elect Engn Comp & Math Sci, Perth, Australia

[2] Nuheara Ltd, Perth, Australia

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2019, Vol. 2019, Issue 1

Keywords:

Single-channel speech enhancement; A priori SNR estimation; Decision-directed approach; Adaptive smoothing factor; Auditory system; QUALITY ASSESSMENT; NOISE; SUPPRESSION; PESQ;

DOI:

10.1186/s13636-019-0150-3

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, an adaptive averaging a priori SNR estimation employing critical band processing is proposed. The proposed method modifies the current decision-directed a priori SNR estimation to achieve faster tracking when the SNR changes. The decision-directed (DD) estimator employs a fixed weighting factor with a value close to one, which makes it slow in following the onsets of speech utterances. The proposed SNR estimator addresses this issue by employing an adaptive weighting factor. This allows improved tracking of onset changes in the speech signal and, as a consequence, better preservation of speech components. This adaptive technique ensures that the weighting between the modified decision-directed a priori estimate and the maximum likelihood a priori estimate is a function of the speech absence probability. The estimate of the speech absence probability is modeled by a sigmoid function. Furthermore, a critical band mapping for the short-time Fourier transform analysis-synthesis system is utilized in the speech enhancement to achieve less musical noise. In addition, to evaluate the ability of the a priori SNR estimation method to preserve speech components, we propose a modified objective measurement known as the modified Hamming distance. Evaluations are performed using both objective and subjective measurements. The experimental results show that the proposed method improves the speech quality under different noise conditions. Moreover, it maintains the advantage of the DD approach in eliminating the musical noise under different SNR conditions. The objective results are supported by subjective listening tests using 10 subjects (5 males and 5 females).
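The weighting idea described in the abstract can be illustrated with a minimal Python sketch for a single frequency bin. It is not the authors' formulation: the abstract gives no equations, so the sigmoid parameters (`slope`, `midpoint`), the bounds `alpha_min` and `alpha_max`, the linear mapping from speech absence probability to weighting factor, and all function names are assumptions chosen for illustration. The critical band mapping is omitted. The sketch only shows how a weighting factor that drops when speech is likely present lets the estimate follow an onset faster than a fixed factor close to one.

```python
import math


def speech_absence_prob(gamma, slope=1.0, midpoint=3.0):
    """Hypothetical sigmoid model of the speech absence probability.

    gamma is the a posteriori SNR (noisy power / noise power).
    A high gamma gives a probability near 0, a low gamma a probability near 1.
    slope and midpoint are illustrative values, not taken from the paper.
    """
    x = min(slope * (gamma - midpoint), 50.0)  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(x))


def fixed_dd_step(gamma, prev_clean_power, noise_power, alpha=0.98):
    """Classic decision-directed update with a fixed weighting factor."""
    xi_ml = max(gamma - 1.0, 0.0)            # maximum likelihood estimate
    xi_prev = prev_clean_power / noise_power  # estimate from the previous frame
    return alpha * xi_prev + (1.0 - alpha) * xi_ml


def adaptive_dd_step(gamma, prev_clean_power, noise_power,
                     alpha_min=0.5, alpha_max=0.98):
    """Decision-directed update with a weighting factor driven by the
    speech absence probability (assumed linear mapping)."""
    xi_ml = max(gamma - 1.0, 0.0)
    xi_prev = prev_clean_power / noise_power
    q = speech_absence_prob(gamma)
    # Speech likely absent (q near 1): heavy smoothing, as in the fixed DD case.
    # Speech likely present (q near 0): more weight on the ML estimate.
    alpha = alpha_min + (alpha_max - alpha_min) * q
    xi = alpha * xi_prev + (1.0 - alpha) * xi_ml
    return xi, alpha


if __name__ == "__main__":
    # Synthetic a posteriori SNR track: noise only, then a speech onset.
    gammas = [1.0, 1.0, 20.0, 20.0, 20.0]
    noise_power = 1.0
    prev_fixed = prev_adapt = 0.0
    for gamma in gammas:
        xi_f = fixed_dd_step(gamma, prev_fixed, noise_power)
        xi_a, alpha = adaptive_dd_step(gamma, prev_adapt, noise_power)
        # Wiener gain applied to the noisy power gives the clean-power
        # estimate that feeds the next frame.
        prev_fixed = (xi_f / (1.0 + xi_f)) ** 2 * gamma * noise_power
        prev_adapt = (xi_a / (1.0 + xi_a)) ** 2 * gamma * noise_power
        print(f"gamma={gamma:5.1f}  fixed xi={xi_f:7.3f}  "
              f"adaptive xi={xi_a:7.3f}  alpha={alpha:.3f}")
```

Running the script prints the two estimates side by side for each frame, which makes it easy to see how quickly each one reacts when `gamma` jumps at the onset. The Wiener gain used for the feedback path is one common choice and is likewise an assumption here.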

Pages: 20