A SUPERVISED MULTI-CHANNEL SPEECH ENHANCEMENT ALGORITHM BASED ON BAYESIAN NMF MODEL

Cited by: 0

Authors:

Chung, Hanwook [1]

Plourde, Eric [2]

Champagne, Benoit [1]

Affiliations:

[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada

[2] Sherbrooke Univ, Dept Elect & Comp Engn, Sherbrooke, PQ, Canada

Source:

2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018) | 2018

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Multi-channel speech enhancement; MVDR beamforming; non-negative matrix factorization; probabilistic generative model; variational Bayesian expectation-maximization; convolutive mixtures; environment;

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In this paper, we introduce a supervised multi-channel speech enhancement algorithm based on a Bayesian multi-channel non-negative matrix factorization (MNMF) model. In the proposed framework, we consider the probabilistic generative model (PGM) of MNMF, specified by Poisson-distributed latent variables and gamma-distributed priors. In the training stage, the MNMF parameters of the speech and noise sources are estimated via the variational Bayesian expectation-maximization (VBEM) algorithm. In the enhancement stage, the clean speech signal is estimated via the MNMF-based minimum variance distortionless response (MVDR) beamformer. To further improve the enhanced speech quality, we efficiently combine the MNMF-based beamforming technique with a classical unsupervised single-channel enhancement method. Experiments show that the proposed method can provide better enhancement performance than the selected benchmarks.
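The MVDR beamforming step named in the abstract can be illustrated with a minimal sketch for a single frequency bin. This is not the paper's MNMF-based implementation: the noise covariance below is a synthetic placeholder rather than an estimate obtained from MNMF parameters, and the function name `mvdr_weights`, the 4-microphone setup, and the steering-vector values are assumptions chosen only for illustration. The sketch uses the standard MVDR solution w = R⁻¹d / (dᴴR⁻¹d).

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Return MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    # Solve R x = d instead of forming the explicit inverse.
    rinv_d = np.linalg.solve(noise_cov, steering)
    return rinv_d / (steering.conj() @ rinv_d)


# Hypothetical 4-microphone example for a single frequency bin.
rng = np.random.default_rng(0)
n_mics = 4

# Assumed steering vector: unit-modulus phase terms (illustrative values only).
steering = np.exp(-1j * np.pi * 0.3 * np.arange(n_mics))

# Synthetic Hermitian positive-definite noise covariance (placeholder,
# standing in for whatever covariance estimate a real system would supply).
a = rng.standard_normal((n_mics, n_mics)) + 1j * rng.standard_normal((n_mics, n_mics))
noise_cov = a @ a.conj().T + n_mics * np.eye(n_mics)

w = mvdr_weights(noise_cov, steering)

# Distortionless constraint: the response toward the target direction is 1.
target_response = w.conj() @ steering

# Output noise power, compared with a plain delay-and-sum beamformer
# that satisfies the same distortionless constraint.
w_das = steering / n_mics
noise_out_mvdr = np.real(w.conj() @ noise_cov @ w)
noise_out_das = np.real(w_das.conj() @ noise_cov @ w_das)
```

Because both beamformers pass the target direction with unit gain, comparing `noise_out_mvdr` with `noise_out_das` shows the noise reduction that the minimum-variance criterion adds on top of simple delay-and-sum.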


Pages: 221-225

Page count: 5
