Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Cited by: 2

Authors:

Le, Xiaohuai [1,2]

Lei, Tong [1,3]

Chen, Li [2]

Guo, Yiqing [2]

He, Chao [2]

Chen, Cheng [2]

Xia, Xianjun [2]

Gao, Hua [2]

Xiao, Yijian [2]

Ding, Piao [2]

Song, Shenyi [2]

Lu, Jing [1,3]

Affiliations:

[1] Nanjing Univ, Key Lab Modern Acoust, Nanjing 210093, Peoples R China

[2] ByteDance, RTC Lab, Beijing, Peoples R China

[3] Horizon Robot, NJU Horizon Intelligent Audio Lab, Beijing 100094, Peoples R China

Source:

INTERSPEECH 2023 | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

Comb filter; Speech enhancement; PercepNet; DeepFilterNet; NETWORKS;

DOI:

10.21437/Interspeech.2023-186

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Because they reduce the feature dimension, filter banks are often used in light-weight full-band speech enhancement models. To further enhance the coarse speech in the sub-band domain, post-filtering is needed for harmonic retrieval. The signal-processing-based comb filters used in RNNoise and PercepNet have limited performance and may degrade speech quality because of inaccurate fundamental frequency estimation. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based fundamental frequency estimator to estimate the discrete fundamental frequencies and a comb filter for harmonic enhancement, which are trained end to end. The experiments show the advantages of our proposed method over PercepNet and DeepFilterNet.
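For orientation, the general idea of a pitch comb filter mentioned in the abstract can be sketched in a few lines. This is only an illustrative, fixed feed-forward comb filter with a fundamental frequency that is assumed to be known exactly. It is not the learnable filter proposed in the paper, nor the actual implementation used in RNNoise or PercepNet. The sampling rate, the fundamental frequency, the gain, and the helper names `comb_filter` and `rms` are all assumptions chosen for the example.

```python
import math

def comb_filter(x, period, gain=0.5):
    """Feed-forward comb filter: y[n] = (x[n] + gain * x[n - period]) / (1 + gain)."""
    y = []
    for n, sample in enumerate(x):
        # Samples before the start of the signal are treated as zero.
        delayed = x[n - period] if n >= period else 0.0
        y.append((sample + gain * delayed) / (1.0 + gain))
    return y

def rms(x):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 16000                 # sampling rate in Hz (assumed)
f0 = 200.0                 # fundamental frequency in Hz (assumed to be known)
period = round(fs / f0)    # pitch period in samples

n_samples = 1600
# A component on the 2nd harmonic (2 * f0) and one halfway between harmonics (2.5 * f0).
harmonic = [math.sin(2 * math.pi * 2.0 * f0 * n / fs) for n in range(n_samples)]
inharmonic = [math.sin(2 * math.pi * 2.5 * f0 * n / fs) for n in range(n_samples)]

# Skip the first `period` samples, where the delay line is still filling up.
harm_gain = rms(comb_filter(harmonic, period)[period:]) / rms(harmonic[period:])
inharm_gain = rms(comb_filter(inharmonic, period)[period:]) / rms(inharmonic[period:])

print(f"gain on harmonic:       {harm_gain:.3f}")
print(f"gain between harmonics: {inharm_gain:.3f}")
```

With an exact pitch period, components on the harmonics pass unchanged while components between harmonics are attenuated. If the period were wrong, the delayed copy would no longer line up with the harmonics, which is the sensitivity to fundamental frequency estimation that the abstract points to.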


Pages: 3894-3898

Page count: 5
