SINGLE CHANNEL SPEECH ENHANCEMENT USING BAYESIAN NMF WITH RECURSIVE TEMPORAL UPDATES OF PRIOR DISTRIBUTIONS

Cited by: 0
Authors
Mohammadiha, Nasser [1]
Taghia, Jalil [1]
Leijon, Arne [1]
Affiliations
[1] KTH Royal Inst Technol, Sound & Image Proc Lab, Stockholm, Sweden
Source
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2012
Keywords
Speech enhancement; NMF; MMSE; MAP; Nonnegative matrix factorization
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present a speech enhancement algorithm based on Bayesian Nonnegative Matrix Factorization (NMF). Both Minimum Mean Square Error (MMSE) and Maximum a Posteriori (MAP) estimates of the magnitude of the clean speech DFT coefficients are derived. To exploit the temporal continuity of the speech and noise signals, a proper prior distribution is introduced by widening the posterior distribution of the NMF coefficients at the previous time frames. To do so, a recursive temporal update scheme is proposed to obtain the mean value of the prior distribution; the uncertainty of the prior information is governed by the shape parameter of the distribution, which is learnt automatically based on the nonstationarity of the signals. Simulations show a considerable improvement over the maximum likelihood NMF-based speech enhancement algorithm for different input SNRs.
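The abstract's two ideas about the prior (a mean that is updated recursively over time, and a shape parameter that sets how uncertain the prior is) can be illustrated with a toy sketch. This record does not give the paper's actual update equations, so everything below is an assumption made for illustration: the exponential-smoothing rule, the gamma form of the prior, and the names `update_prior_mean`, `gamma_prior_variance`, and `alpha` are hypothetical and are not taken from the paper.

```python
import numpy as np


def update_prior_mean(prior_mean, posterior_mean, alpha=0.9):
    """Recursive (exponential-smoothing) update of the prior mean.

    ASSUMPTION: this smoothing rule is a stand-in; the paper's own
    recursion is not stated in this record.
    """
    return alpha * prior_mean + (1.0 - alpha) * posterior_mean


def gamma_prior_variance(mean, shape):
    """Variance of a gamma distribution with the given mean and shape.

    For Gamma(shape=k, scale=theta): mean = k * theta, var = k * theta**2,
    so with the mean held fixed, var = mean**2 / k. A larger shape
    therefore means a narrower (more confident) prior.
    """
    return mean ** 2 / shape


# Toy posterior means of two NMF coefficients over three time frames
# (made-up numbers, for illustration only).
posterior_means = np.array([[1.0, 2.0],
                            [1.2, 1.8],
                            [3.0, 0.5]])

prior_mean = posterior_means[0].copy()
history = [prior_mean]
for post in posterior_means[1:]:
    prior_mean = update_prior_mean(prior_mean, post, alpha=0.5)
    history.append(prior_mean)
history = np.stack(history)

# Same prior mean, two different shape parameters: the larger shape
# yields the smaller variance, i.e. less widening of the prior.
wide = gamma_prior_variance(2.0, shape=1.0)
narrow = gamma_prior_variance(2.0, shape=4.0)
```

In this sketch, a small shape value corresponds to a strongly widened prior, which would suit a nonstationary signal, while a large shape value keeps the prior close to its recursively tracked mean.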
Pages: 4561-4564
Number of pages: 4