Single channel speech enhancement using iterative constrained NMF based adaptive wiener gain

Cited by: 1
Authors
Yechuri, Sivaramakrishna [1]
Vanambathina, Sunnydayal [1]
Affiliations
[1] VIT AP Univ, SENSE, Amaravati, India
Keywords
NMF; Adaptive Wiener gain; Inverse Nakagami; Erlang; Inverse gamma; Student's t probability density functions; SDR; PESQ; STOI; NONNEGATIVE MATRIX FACTORIZATION; ALGORITHMS; EXTRACTION; MACHINE; FILTER
DOI
10.1007/s11042-023-16480-w
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We propose a novel single-channel speech enhancement algorithm for non-stationary noise that uses an adaptive Wiener gain based on iterative constrained non-negative matrix factorization (NMF). NMF-based Wiener filtering methods have recently been used for speech enhancement, and the performance of the Wiener filter depends on the value of the adaptive gain factor (alpha). In these methods, alpha is held constant regardless of noise type and signal-to-noise ratio (SNR), which affects speech enhancement performance. To overcome this, we calculate the adaptive factor with a genetic algorithm (GA), which adjusts the adaptive Wiener gain according to the noise type and SNR level. The GA-based adaptive Wiener gain minimizes Wiener filter estimation errors and improves speech quality by adjusting the basis vector weights of noise and speech. Additionally, we use the iterative constrained NMF (IC-NMF) method to calculate the priors from noisy speech magnitudes. We select the Erlang, inverse gamma, Student's t, and inverse Nakagami distributions for the speech priors and Gaussian distributions for the noise priors. Noise and speech samples are well correlated with these distributions, which allows accurate estimation of the statistics needed to regularize the NMF criterion. We therefore combine iterative constrained NMF with GA-based adaptive Wiener filtering for speech enhancement. The proposed method outperforms other benchmark algorithms in terms of source-to-distortion ratio (SDR), short-time objective intelligibility (STOI), and perceptual evaluation of speech quality (PESQ).
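The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the exact gain formula, the GA configuration, nor the IC-NMF priors. The sketch therefore assumes plain KL-divergence NMF with fixed, pre-trained bases (no distribution priors), an assumed gain of the form S / (S + alpha * N), and a toy genetic search that needs a clean reference magnitude for training. All function names (`nmf_activations`, `wiener_gain`, `ga_search_alpha`) and all parameter values are hypothetical.

```python
import numpy as np

def nmf_activations(V, W, n_iter=100, eps=1e-12):
    """Estimate H >= 0 with fixed bases W so that V ~= W @ H.
    Uses the standard KL-divergence multiplicative update (no priors)."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
    return H

def wiener_gain(S_hat, N_hat, alpha, eps=1e-12):
    """Assumed parametric gain: alpha scales the noise estimate."""
    return S_hat / (S_hat + alpha * N_hat + eps)

def ga_search_alpha(S_hat, N_hat, V, S_ref, pop=20, gens=30,
                    lo=0.1, hi=5.0, seed=0):
    """Toy genetic search for alpha that minimizes the squared error
    between the enhanced magnitude and a clean reference S_ref."""
    rng = np.random.default_rng(seed)

    def cost(a):
        return np.mean((wiener_gain(S_hat, N_hat, a) * V - S_ref) ** 2)

    population = rng.uniform(lo, hi, pop)
    population[0] = 1.0  # seed with the fixed-gain baseline (alpha = 1)
    for _ in range(gens):
        costs = np.array([cost(a) for a in population])
        parents = population[np.argsort(costs)[: pop // 2]]  # keep best half
        n_child = pop - parents.size
        pairs = rng.choice(parents, size=(n_child, 2))
        # Crossover: mean of two parents. Mutation: small Gaussian jitter.
        children = pairs.mean(axis=1) + rng.normal(0.0, 0.1, n_child)
        population = np.clip(np.concatenate([parents, children]), lo, hi)
    costs = np.array([cost(a) for a in population])
    return population[np.argmin(costs)]

# Synthetic magnitude spectrograms standing in for real speech and noise.
rng = np.random.default_rng(1)
F, T, Ks, Kn = 64, 50, 8, 4
W_s, W_n = rng.random((F, Ks)), rng.random((F, Kn))  # "pre-trained" bases
S = W_s @ rng.random((Ks, T))                        # clean speech magnitude
N = W_n @ rng.random((Kn, T))                        # noise magnitude
V = S + N                                            # noisy magnitude

H = nmf_activations(V, np.hstack([W_s, W_n]))
S_hat, N_hat = W_s @ H[:Ks], W_n @ H[Ks:]
alpha = ga_search_alpha(S_hat, N_hat, V, S)
S_enh = wiener_gain(S_hat, N_hat, alpha) * V         # enhanced magnitude
```

Seeding the population with alpha = 1 and always keeping the best half means the search can never return a gain factor that performs worse, on the training reference, than the fixed-gain baseline. In a real system the search would be run per noise type and SNR level, and the fitness could be an objective measure such as SDR instead of squared error.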
Pages: 26233-26254
Page count: 22