Robust Constrained MFMVDR Filters for Single-Channel Speech Enhancement Based on Spherical Uncertainty Set

Cited by: 7

Authors:

Fischer, Dorte [1,2]

Doclo, Simon [1,2]

Affiliations:

[1] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, D-26129 Oldenburg, Germany

[2] Carl von Ossietzky Univ Oldenburg, Cluster Excellence Hearing4all, D-26129 Oldenburg, Germany

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2021 / Vol. 29

Keywords:

Multi-Frame MVDR Filter; single-microphone speech enhancement; speech interframe correlation; NOISE-REDUCTION; ERROR; DEREVERBERATION; DOMAIN;

DOI:

10.1109/TASLP.2020.3042013

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Aiming at exploiting speech correlation across consecutive time-frames in the short-time Fourier transform domain, the multi-frame minimum variance distortionless response (MFMVDR) filter for single-channel speech enhancement has been proposed. The MFMVDR filter requires an accurate estimate of the normalized speech correlation vector in order to avoid speech distortion and artifacts. In this paper we investigate the potential of using robust MVDR filtering techniques to estimate the normalized speech correlation vector as the vector maximizing the total signal output power within a spherical uncertainty set, which corresponds to imposing a quadratic inequality constraint. Whereas the singly-constrained (SC) MFMVDR filter only considers the quadratic inequality constraint to estimate the (non-normalized) speech correlation vector, the doubly-constrained (DC) MFMVDR filter integrates a linear normalization constraint into the optimization problem to directly estimate the normalized speech correlation vector. To set the upper bound of the quadratic inequality constraint for each time-frequency point, we propose to use a trained non-linear mapping function that depends on the a-priori signal-to-noise ratio (SNR). Experimental results for different speech signals, noise types and SNRs show that the proposed constrained approaches yield a more accurate estimate of the normalized speech correlation vector than a state-of-the-art maximum-likelihood (ML) estimator. An instrumental and a perceptual evaluation show that both constrained MFMVDR filters lead to less speech and noise distortion but a lower noise reduction than the ML-MFMVDR filter, where the DC-MFMVDR filter is preferred in terms of overall quality compared to the SC-MFMVDR and ML-MFMVDR filters.
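The estimation idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes the textbook robust Capon formulation: the vector that maximizes the MVDR output power inside a sphere around an initial estimate is the minimizer of gamma^H R^{-1} gamma subject to ||gamma - gamma0||^2 <= eps. The function names, the bisection solver, and the test data are illustrative assumptions, not the authors' implementation, and the trained SNR-dependent bound and the additional normalization constraint of the DC variant are not modeled.

```python
import numpy as np

def mvdr_weights(R, gamma):
    """MVDR weights h = R^{-1} gamma / (gamma^H R^{-1} gamma)."""
    Rinv_g = np.linalg.solve(R, gamma)
    return Rinv_g / (gamma.conj() @ Rinv_g)

def output_power(R, gamma):
    """MVDR output power 1 / (gamma^H R^{-1} gamma)."""
    return 1.0 / np.real(gamma.conj() @ np.linalg.solve(R, gamma))

def robust_vector(R, gamma0, eps, iters=200):
    """Maximize the MVDR output power over the sphere ||gamma - gamma0||^2 <= eps.

    The stationarity condition gives gamma = gamma0 - (I + mu R)^{-1} gamma0,
    where the multiplier mu >= 0 solves ||(I + mu R)^{-1} gamma0||^2 = eps.
    The left-hand side decreases monotonically in mu, so bisection suffices.
    """
    if not 0.0 < eps < np.linalg.norm(gamma0) ** 2:
        raise ValueError("eps must satisfy 0 < eps < ||gamma0||^2")
    eye = np.eye(len(gamma0))

    def residual_norm2(mu):
        return np.linalg.norm(np.linalg.solve(eye + mu * R, gamma0)) ** 2

    lo, hi = 0.0, 1.0
    while residual_norm2(hi) > eps:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual_norm2(mid) > eps:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return gamma0 - np.linalg.solve(eye + mu * R, gamma0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    R = A @ A.conj().T + 0.1 * np.eye(4)   # hypothetical Hermitian correlation matrix
    gamma0 = np.array([1.0, 0.8, 0.5, 0.2], dtype=complex)  # hypothetical initial estimate
    gamma = robust_vector(R, gamma0, eps=0.1)
    h = mvdr_weights(R, gamma)
    print("distortionless response:", np.vdot(h, gamma))
    print("power gain over initial estimate:", output_power(R, gamma) / output_power(R, gamma0))
```

In this sketch the radius `eps` is a fixed number; the abstract instead sets the bound per time-frequency point from the a-priori SNR through a trained mapping.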


Pages: 618-631

Number of pages: 14

References (47 in total):

[1] Robust Speech-Distortion Weighted Interframe Wiener Filters for Single-Channel Noise Reduction [J]. Andersen, Kristian Timm; Moonen, Marc. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (01): 97-107

[2] [Anonymous], 2013, COMPUT REV

[3] Benesty J, 2012, SPRBRIEF ELECT, P1, DOI 10.1007/978-3-642-23250-3

[4] Boyd S., Vandenberghe L., 2004, Convex Optimization, DOI 10.1017/CBO9780511804441

[5] A novel a priori SNR estimation approach based on selective cepstro-temporal smoothing [J]. Breithaupt, Colin; Gerkmann, Timo; Martin, Rainer. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008: 4897-4900

[6] Robust Adaptive Beamforming [J]. Cox, H; Zeskind, RM; Owen, MM. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1987, 35 (10): 1365-1376

[7] Multichannel Signal Enhancement Algorithms for Assisted Listening Devices [J]. Doclo, Simon; Kellermann, Walter; Makino, Shoji; Nordholm, Sven. IEEE SIGNAL PROCESSING MAGAZINE, 2015, 32 (02): 18-30

[8] Speech Enhancement Using a Minimum Mean-Square Error Log-Spectral Amplitude Estimator [J]. Ephraim, Y; Malah, D. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, 33 (02): 443-445

[9] Speech Enhancement Using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator [J]. Ephraim, Y; Malah, D. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1984, 32 (06): 1109-1121

[10] Field A., 2018, Discovering Statistics Using SPSS, 5th ed.
