Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Cited by: 2
Authors
Le, Xiaohuai [1, 2]
Lei, Tong [1, 3]
Chen, Li [2]
Guo, Yiqing [2]
He, Chao [2]
Chen, Cheng [2]
Xia, Xianjun [2]
Gao, Hua [2]
Xiao, Yijian [2]
Ding, Piao [2]
Song, Shenyi [2]
Lu, Jing [1, 3]
Affiliations
[1] Nanjing Univ, Key Lab Modern Acoust, Nanjing 210093, Peoples R China
[2] ByteDance, RTC Lab, Beijing, Peoples R China
[3] Horizon Robot, NJU Horizon Intelligent Audio Lab, Beijing 100094, Peoples R China
Source
INTERSPEECH 2023 | 2023
Funding
National Natural Science Foundation of China;
Keywords
Comb filter; Speech enhancement; PercepNet; DeepFilterNet; NETWORKS;
DOI
10.21437/Interspeech.2023-186
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Because they reduce the feature dimension, filter banks are often used in light-weight full-band speech enhancement models. To further enhance the coarse speech estimated in the sub-band domain, post-filtering is needed for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may degrade speech quality when the fundamental frequency is estimated inaccurately. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based fundamental frequency estimator to estimate the discrete fundamental frequencies and a comb filter for harmonic enhancement, which are trained end to end. The experiments show the advantages of our proposed method over PercepNet and DeepFilterNet.
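For readers unfamiliar with the signal processing-based comb filters the abstract contrasts against, the following is a minimal illustrative sketch of a fixed, symmetric FIR comb filter driven by a known pitch period. It is not the learnable filter proposed in the paper, nor the exact filter of RNNoise or PercepNet; the function name `comb_filter`, the `taps` parameter, and the Hann-shaped weights are assumptions made for illustration.

```python
import numpy as np

def comb_filter(x, period, taps=2):
    """Weighted average of x with copies shifted by multiples of `period`.

    Hypothetical helper: y[n] = sum_k w[k] * x[n - k * period], k = -taps..taps.
    Components that repeat every `period` samples pass unchanged;
    uncorrelated noise is averaged down.
    """
    x = np.asarray(x, dtype=float)
    shifts = np.arange(-taps, taps + 1)
    # Hann-shaped weights (assumed choice), zero endpoints dropped, sum to 1.
    w = np.hanning(2 * taps + 3)[1:-1]
    w = w / w.sum()
    pad = taps * period
    xp = np.pad(x, pad)  # zero-pad both ends so every shift is in range
    y = np.zeros_like(x)
    for wk, k in zip(w, shifts):
        start = pad - k * period  # xp[start + n] == x[n - k * period]
        y += wk * xp[start:start + len(x)]
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = np.arange(4000)
    period = 100  # pitch period in samples, assumed known here
    harmonic = np.sin(2 * np.pi * n / period) + 0.5 * np.sin(2 * np.pi * 3 * n / period)
    noisy = harmonic + rng.standard_normal(n.size)
    enhanced = comb_filter(noisy, period)
    edge = 2 * period  # skip samples affected by zero padding
    err_in = np.mean((noisy - harmonic)[edge:-edge] ** 2)
    err_out = np.mean((enhanced - harmonic)[edge:-edge] ** 2)
    print(f"noise power before: {err_in:.3f}, after: {err_out:.3f}")
```

If `period` does not match the true pitch period, the shifted copies no longer add in phase and the harmonics themselves are attenuated, which is the sensitivity to fundamental frequency estimation that the abstract refers to.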
Pages: 3894-3898
Number of pages: 5