A SUPERVISED MULTI-CHANNEL SPEECH ENHANCEMENT ALGORITHM BASED ON BAYESIAN NMF MODEL

Citations: 0
Authors
Chung, Hanwook [1]
Plourde, Eric [2]
Champagne, Benoit [1]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada
[2] Sherbrooke Univ, Dept Elect & Comp Engn, Sherbrooke, PQ, Canada
Source
2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018) | 2018
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Multi-channel speech enhancement; MVDR beamforming; non-negative matrix factorization; probabilistic generative model; variational Bayesian expectation-maximization; CONVOLUTIVE MIXTURES; ENVIRONMENT;
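DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract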
In this paper, we introduce a supervised multi-channel speech enhancement algorithm based on a Bayesian multi-channel non-negative matrix factorization (MNMF) model. In the proposed framework, we consider the probabilistic generative model (PGM) of MNMF, specified by Poisson-distributed latent variables and gamma-distributed priors. In the training stage, the MNMF parameters of the speech and noise sources are estimated via the variational Bayesian expectation-maximization (VBEM) algorithm. In the enhancement stage, the clean speech signal is estimated via the MNMF-based minimum variance distortionless response (MVDR) beamformer. To further improve the enhanced speech quality, we efficiently combine the MNMF-based beamforming technique with a classical unsupervised single-channel enhancement method. Experiments show that the proposed method can provide better enhancement performance than the selected benchmarks.
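This record does not reproduce the paper's equations, so the following is only a minimal sketch of the textbook MVDR beamformer that the enhancement stage builds on, w = R^{-1} d / (d^H R^{-1} d), for a single frequency bin. The function name `mvdr_weights`, the steering vector, and the randomly generated noise covariance are illustrative assumptions; in the proposed method the spatial statistics would presumably come from the trained MNMF model rather than from random data.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin.

    noise_cov: (M, M) Hermitian positive-definite noise covariance (assumed known).
    steering:  (M,) steering vector toward the target speaker (assumed known).
    """
    # Solve R x = d rather than forming the inverse explicitly.
    rinv_d = np.linalg.solve(noise_cov, steering)
    return rinv_d / (steering.conj() @ rinv_d)


# Toy data (hypothetical): 4 microphones, 64 noise snapshots.
rng = np.random.default_rng(0)
num_mics, num_frames = 4, 64
steering = np.exp(-1j * np.pi * 0.3 * np.arange(num_mics))
noise = rng.standard_normal((num_mics, num_frames)) \
    + 1j * rng.standard_normal((num_mics, num_frames))
# Sample covariance with a small diagonal loading for numerical stability.
noise_cov = noise @ noise.conj().T / num_frames + 1e-3 * np.eye(num_mics)

w = mvdr_weights(noise_cov, steering)

# Distortionless constraint: the response toward the target is w^H d = 1.
distortionless_gain = w.conj() @ steering

# Residual noise power of MVDR versus a delay-and-sum beamformer
# that satisfies the same constraint.
w_das = steering / (steering.conj() @ steering)
mvdr_noise_power = (w.conj() @ noise_cov @ w).real
das_noise_power = (w_das.conj() @ noise_cov @ w_das).real
```

Among all filters with unit response toward the target, MVDR minimizes the output noise power, so `mvdr_noise_power` should not exceed `das_noise_power` here.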
Pages: 221-225
Page count: 5