Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Cited by: 2
Authors
Le, Xiaohuai [1 ,2 ]
Lei, Tong [1 ,3 ]
Chen, Li [2 ]
Guo, Yiqing [2 ]
He, Chao [2 ]
Chen, Cheng [2 ]
Xia, Xianjun [2 ]
Gao, Hua [2 ]
Xiao, Yijian [2 ]
Ding, Piao [2 ]
Song, Shenyi [2 ]
Lu, Jing [1 ,3 ]
Affiliations
[1] Nanjing Univ, Key Lab Modern Acoust, Nanjing 210093, Peoples R China
[2] ByteDance, RTC Lab, Beijing, Peoples R China
[3] Horizon Robot, NJU Horizon Intelligent Audio Lab, Beijing 100094, Peoples R China
Source
INTERSPEECH 2023 | 2023
Funding
National Natural Science Foundation of China
Keywords
Comb filter; Speech enhancement; PercepNet; DeepFilterNet; NETWORKS;
DOI
10.21437/Interspeech.2023-186
CLC number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Filter banks are often used in light-weight full-band speech enhancement models to reduce the feature dimensionality. To further enhance the coarse speech obtained in the sub-band domain, post-filtering for harmonic retrieval is necessary. The signal-processing-based comb filters used in RNNoise and PercepNet have limited performance and may degrade speech quality when the fundamental frequency is estimated inaccurately. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based estimator that predicts discrete fundamental frequencies together with a comb filter for harmonic enhancement, and both are trained in an end-to-end manner. Experiments show the advantages of the proposed method over PercepNet and DeepFilterNet.
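The abstract only outlines the approach. As a rough, hypothetical illustration of the underlying idea (not the authors' implementation), the Python sketch below applies a fixed time-domain comb filter at a given fundamental frequency: samples spaced one pitch period apart are averaged, so the periodic components at f0 and its harmonics add coherently while uncorrelated noise is attenuated. In the paper the fundamental frequency is predicted by a DNN and the comb filter itself is learned end-to-end; the function and parameter names here are illustrative only.

import numpy as np

def comb_filter(x, f0, fs=48000, num_taps=3):
    # Causal comb filter: average num_taps copies of the signal delayed by
    # multiples of the pitch period, which emphasizes harmonics of f0.
    T = int(round(fs / f0))        # pitch period in samples
    y = np.zeros(len(x))
    for k in range(num_taps):
        d = k * T                  # delay of the k-th tap
        y[d:] += x[:len(x) - d] if d > 0 else x
    return y / num_taps

# Toy usage: a noisy 200 Hz harmonic signal at 48 kHz
fs, f0 = 48000, 200.0
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
noisy = clean + 0.3 * np.random.randn(len(t))
enhanced = comb_filter(noisy, f0, fs)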
Pages: 3894 - 3898
Number of pages: 5
Related papers (23 in total)
  • [1] Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Full-Band Speech Enhancement
    Yu, Guochen
    Li, Andong
    Liu, Wenzhe
    Zheng, Chengshi
    Wang, Yutian
    Wang, Hui
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 483 - 487
  • [2] Lightweight Full-band and Sub-band Fusion Network for Real Time Speech Enhancement
    Chen, Zhuangqi
    Zhang, Pingjian
    INTERSPEECH 2022, 2022, : 921 - 925
  • [3] Speech Enhancement Using Harmonic Emphasis and Adaptive Comb Filtering
    Jin, Wen
    Liu, Xin
    Scordilis, Michael S.
    Han, Lu
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (02): 356 - 368
  • [4] DMF-Net: A decoupling-style multi-band fusion model for full-band speech enhancement
    Yu, Guochen
    Guan, Yuansheng
    Meng, Weixin
    Zheng, Chengshi
    Wang, Hui
    Wang, Yutian
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1382 - 1387
  • [5] On the Use of Absolute Threshold of Hearing-based Loss for Full-band Speech Enhancement
    Mars, Rohith
    Das, Rohan Kumar
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 458 - 462
  • [6] FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement
    Hao, Xiang
    Su, Xiangdong
    Horaud, Radu
    Li, Xiaofei
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6633 - 6637
  • [7] DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio Based on Deep Filtering
    Schroeter, Hendrik
    Escalante-B, Alberto N.
    Rosenkranz, Tobias
    Maier, Andreas
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7407 - 7411
  • [8] FSI-Net: A dual-stage full- and sub-band integration network for full-band speech enhancement
    Yu, Guochen
    Wang, Hui
    Li, Andong
    Liu, Wenzhe
    Zhang, Yuan
    Wang, Yutian
    Zheng, Chengshi
    APPLIED ACOUSTICS, 2023, 211
  • [9] Light-Weight Self-Attention Augmented Generative Adversarial Networks for Speech Enhancement
    Li, Lujun
    Lu, Zhenxing
    Watzel, Tobias
    Kurzinger, Ludwig
    Rigoll, Gerhard
    ELECTRONICS, 2021, 10 (13)
  • [10] Adaptive-FSN: Integrating Full-Band Extraction and Adaptive Sub-Band Encoding for Monaural Speech Enhancement
    Tsao, Yu-Sheng
    Ho, Kuan-Hsun
    Hung, Jeih-Weih
    Chen, Berlin
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 458 - 464