MMSE and maximum a posteriori estimators for speech enhancement in additive noise assuming a t-location-scale clean speech prior

Cited by: 8
Authors
Faraji, Neda [1]
Kohansal, Akram [2]
Affiliations
[1] Imam Khomeini Int Univ, Dept Elect Engn, Qazvin, Iran
[2] Imam Khomeini Int Univ, Dept Stat, Qazvin, Iran
Keywords
speech enhancement; least mean squares methods; Gaussian noise; discrete Fourier transforms; probability; Wiener filters; maximum likelihood estimation; MMSE estimator; minimum mean square error; maximum-a-posteriori estimators; additive Gaussian noise; t-location-scale clean speech prior; t-location-scale probability density function; t-location-scale PDF; complex-valued DFT coefficients; clean speech signals; Jensen-Shannon divergence estimator; Wiener filter; Laplacian prior PDF; gamma prior PDF; generalised gamma prior PDF; minimum squared error; signal distortion; perceptual evaluation; speech quality SNR; segmental SNR; general SNR; MOTIVATED BAYESIAN-ESTIMATORS; SPECTRAL AMPLITUDE ESTIMATOR; SQUARE ERROR ESTIMATION;
DOI
10.1049/iet-spr.2017.0446
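Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification code
0808; 0809
Abstract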
The authors derive closed-form solutions for the minimum mean square error (MMSE) and maximum a posteriori estimators for speech enhancement in additive Gaussian noise, assuming a t-location-scale probability density function (PDF) as the clean speech prior. Fitting a t-location-scale PDF to the real and imaginary parts of the discrete Fourier transform (DFT) coefficients of clean speech signals yields a lower Jensen-Shannon divergence than other heavy-tailed distributions such as the Laplacian and gamma. The authors use the two presented estimators, along with the Wiener filter and MMSE estimators based on Laplacian, gamma, and generalised gamma prior PDFs, to enhance noisy signals from the NOIZEUS database. All the estimators are compared in terms of both signal and noise distortions. The results show that the proposed MMSE estimator gives the minimum squared error and signal distortion when estimating the complex-valued DFT coefficients of speech. The quality of the enhanced signals is also assessed in terms of perceptual evaluation of speech quality, segmental SNR, and general SNR.
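The distribution-fitting comparison described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' procedure: the data are synthetic Student-t samples standing in for the real or imaginary parts of clean-speech DFT coefficients, the helper name `js_divergence_of_fit` is hypothetical, the comparison set is limited to Laplacian and Gaussian fits, and the divergence is computed on a histogram restricted to the central 99% of the samples. SciPy's `stats.t` with `loc` and `scale` is the same family as the t-location-scale distribution.

```python
import numpy as np
from scipy import stats
from scipy.spatial.distance import jensenshannon


def js_divergence_of_fit(samples, dist, bins=100):
    """Fit `dist` by maximum likelihood and return the Jensen-Shannon
    divergence (in bits) between the sample histogram and the fitted model,
    both discretised on the same bins (hypothetical helper)."""
    params = dist.fit(samples)  # shape parameters (if any), then loc, scale
    # Restrict to the central 99% so a few extreme outliers do not
    # stretch the bins; samples outside this range are ignored.
    lo, hi = np.quantile(samples, [0.005, 0.995])
    edges = np.linspace(lo, hi, bins + 1)
    counts, _ = np.histogram(samples, bins=edges)
    empirical = counts / counts.sum()
    # Probability mass the fitted model assigns to each bin, renormalised
    # over the same restricted range.
    model = np.diff(dist.cdf(edges, *params))
    model = model / model.sum()
    # SciPy returns the JS *distance*; squaring it gives the divergence.
    return jensenshannon(empirical, model, base=2) ** 2


# Synthetic heavy-tailed data (assumption: stand-in for speech DFT parts).
rng = np.random.default_rng(0)
samples = rng.standard_t(3, size=20000)

candidates = {
    "t-location-scale": stats.t,
    "Laplacian": stats.laplace,
    "Gaussian": stats.norm,
}
results = {name: js_divergence_of_fit(samples, dist)
           for name, dist in candidates.items()}
for name, value in results.items():
    print(f"{name}: {value:.5f} bits")
```

On real data, `samples` would be replaced by the real or imaginary parts of the DFT coefficients of clean speech frames; a lower value indicates a closer fit of the candidate prior.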
Pages: 532-543
Number of pages: 12