A SIMPLE AND EFFECTIVE FRAMEWORK FOR A PRIORI SNR ESTIMATION

Cited by: 0
Authors
Stahl, Johannes [1 ]
Mowlaee, Pejman [2 ]
Affiliations
[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, Graz, Austria
[2] Widex AS, Nymollevej 6, DK-3540 Lynge, Denmark
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Funding
Austrian Science Fund;
Keywords
speech enhancement; a priori SNR; decision-directed; pitch-adaptive; spectral amplitude estimator
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
The problem of estimating the a priori signal-to-noise ratio (SNR) for single-channel speech enhancement is addressed. Similar to the decision-directed approach, we linearly combine the maximum likelihood estimate of the a priori SNR with an estimate obtained from the previous frame. Based on the harmonic model for voiced speech, we propose to smooth the a priori SNR estimate along harmonic trajectories instead of along fixed discrete Fourier transform frequency bins. To obtain the spectral coefficients at the harmonic frequencies, we interpolate using pitch-adaptive zero-padding. The resulting pitch-adaptive decision-directed (PADDi) method increases the noise attenuation compared to the classical decision-directed approach and outperforms benchmark methods in terms of speech enhancement performance for several noise types at different SNRs, as quantified by objective evaluation criteria.
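For orientation, the classical decision-directed recursion that the abstract refers to can be sketched as follows. This is a minimal illustration of the baseline only, smoothing along fixed DFT bins; it is not the PADDi method, whose harmonic-trajectory smoothing and pitch-adaptive zero-padding are not specified in this record. The function name `decision_directed_snr`, the smoothing factor `alpha=0.98`, the floor `xi_min`, and the use of a Wiener gain to form the previous-frame estimate are assumptions chosen for illustration.

```python
import numpy as np


def decision_directed_snr(noisy_power, noise_power, alpha=0.98,
                          xi_min=10 ** (-25 / 10)):
    """Classical decision-directed a priori SNR estimate (illustrative sketch).

    noisy_power: array of shape (frames, bins) holding |Y|^2 per frame and bin.
    noise_power: noise power estimate, shape (bins,) or (frames, bins).
    alpha:       weight of the previous-frame estimate (assumed value).
    xi_min:      lower bound on the a priori SNR (assumed value, -25 dB).
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.broadcast_to(
        np.asarray(noise_power, dtype=float), noisy_power.shape)

    gamma = noisy_power / noise_power          # a posteriori SNR
    xi = np.empty_like(gamma)                  # a priori SNR estimates
    prev_term = np.zeros(noisy_power.shape[1])  # previous-frame estimate

    for frame in range(noisy_power.shape[0]):
        # Maximum likelihood estimate, clipped at zero.
        xi_ml = np.maximum(gamma[frame] - 1.0, 0.0)
        # Linear combination of previous-frame and ML estimates.
        xi[frame] = np.maximum(
            alpha * prev_term + (1.0 - alpha) * xi_ml, xi_min)
        # Assumed gain function: Wiener gain. The estimated clean speech
        # power divided by the noise power feeds the next frame.
        gain = xi[frame] / (1.0 + xi[frame])
        prev_term = gain ** 2 * gamma[frame]

    return xi
```

With `alpha=0` the recursion reduces to the floored maximum likelihood estimate, while values close to 1 weight the previous frame heavily and smooth the estimate over time.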
Pages: 5644-5648
Page count: 5