A SUPERVISED MULTI-CHANNEL SPEECH ENHANCEMENT ALGORITHM BASED ON BAYESIAN NMF MODEL

Cited by: 0

Authors:

Chung, Hanwook ^{[1]}

Plourde, Eric ^{[2]}

Champagne, Benoit ^{[1]}

Affiliations:

[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada

[2] Sherbrooke Univ, Dept Elect & Comp Engn, Sherbrooke, PQ, Canada

Source:

2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018) | 2018

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Multi-channel speech enhancement; MVDR beamforming; non-negative matrix factorization; probabilistic generative model; variational Bayesian expectation-maximization; CONVOLUTIVE MIXTURES; ENVIRONMENT;

DOI:

Not available

Chinese Library Classification number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Subject classification code:

0808 ; 0809 ;

Abstract:

In this paper, we introduce a supervised multi-channel speech enhancement algorithm based on a Bayesian multi-channel non-negative matrix factorization (MNMF) model. In the proposed framework, we consider the probabilistic generative model (PGM) of MNMF, specified by Poisson-distributed latent variables and gamma-distributed priors. In the training stage, the MNMF parameters of the speech and noise sources are estimated via the variational Bayesian expectation-maximization (VBEM) algorithm. In the enhancement stage, the clean speech signal is estimated via the MNMF-based minimum variance distortionless response (MVDR) beamformer. To further improve the enhanced speech quality, we efficiently combine the MNMF-based beamforming technique with a classical unsupervised single-channel enhancement method. Experiments show that the proposed method can provide better enhancement performance than the selected benchmarks.
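The enhancement stage described in the abstract relies on an MVDR beamformer. As a rough illustration only, and not the authors' MNMF-based implementation, the sketch below computes the textbook MVDR weights w = R_n^{-1} d / (d^H R_n^{-1} d) for a single frequency bin in plain Python. The function names, the two-microphone steering vector `d`, and the noise covariance `R_n` are hypothetical values chosen for the example; in the paper the spatial statistics would instead come from the trained MNMF model.

```python
import cmath


def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0j] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights: w = R_n^{-1} d / (d^H R_n^{-1} d)."""
    rinv_d = solve(noise_cov, steering)
    denom = sum(d.conjugate() * v for d, v in zip(steering, rinv_d))
    return [v / denom for v in rinv_d]


def apply_beamformer(weights, mic_bins):
    """Beamformer output y = w^H x for one time-frequency bin."""
    return sum(w.conjugate() * x for w, x in zip(weights, mic_bins))


def output_noise_power(weights, noise_cov):
    """Residual noise power w^H R_n w after beamforming."""
    n = len(weights)
    return sum(
        weights[i].conjugate() * noise_cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    ).real


# Hypothetical two-microphone setup at one frequency bin (assumed values).
d = [1 + 0j, cmath.exp(-1j * 0.8)]            # steering vector, mic 1 as reference
R_n = [[1.0 + 0j, 0.3 + 0.1j],
       [0.3 - 0.1j, 1.5 + 0j]]                # Hermitian noise covariance

w = mvdr_weights(R_n, d)
target_gain = apply_beamformer(w, d)          # response toward the target
noise_out = output_noise_power(w, R_n)
```

In this sketch the distortionless constraint means the response toward the target is unity, while the residual noise power cannot exceed that of the reference microphone alone, because selecting only that microphone also satisfies the constraint.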


Pages: 221-225

Number of pages: 5

