ALLEVIATING THE LOSS-METRIC MISMATCH IN SUPERVISED SINGLE-CHANNEL SPEECH ENHANCEMENT

Cited by: 1

Authors:

Yang, Yang [1,2,3]

Zhang, Hui [1,2,3]

Zhang, Xueliang [1,2,3]

Zhang, Huaiwen [1,2,3]

Affiliations:

[1] Inner Mongolia Univ, Coll Comp Sci, Hohhot, Inner Mongolia, Peoples R China

[2] Natl & Local Joint Engn Res Ctr Intelligent Infor, Hohhot, Inner Mongolia, Peoples R China

[3] Inner Mongolia Key Lab Mongolian Informat Proc Te, Hohhot, Inner Mongolia, Peoples R China

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Funding:

National Natural Science Foundation of China

Keywords:

Supervised Single-Channel Speech Enhancement; Loss-Metric Mismatch; Function Smoothing; NOISE;

DOI:

10.1109/ICASSP43922.2022.9746915

Chinese Library Classification (CLC):

O42 [Acoustics]

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we study the loss-metric mismatch problem in supervised single-channel speech enhancement systems. Most existing speech enhancement systems achieve unsatisfactory performance because their empirically selected loss functions have semantic gaps with the non-differentiable evaluation metrics, a problem known as loss-metric mismatch. In this work, we propose a simple yet efficient method for generating suitable loss functions for real front-end speech enhancement scenarios to alleviate this problem. Specifically, we adopt the function smoothing technique and approximate the non-differentiable evaluation metrics with a linear combination of a set of basis functions. Experimental results demonstrate that the loss function generated by our method helps the speech enhancement system outperform systems trained with traditional, empirically selected loss functions on most evaluation metrics.
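
The idea in the abstract of approximating a non-differentiable metric with a linear combination of basis functions can be illustrated with a toy sketch. This is not the authors' implementation: the metric `toy_metric`, the four basis functions, the sampling scheme, and the least-squares fit are all assumptions chosen only to make the example self-contained.

```python
import numpy as np

EPS = 1e-8  # keeps the smooth absolute-value basis differentiable at zero


def toy_metric(est, ref):
    """Hypothetical non-differentiable metric: mean absolute error quantized to 0.1 steps."""
    return np.floor(10.0 * np.mean(np.abs(est - ref))) / 10.0


def basis(est, ref):
    """Assumed set of smooth basis functions of the estimate/reference pair."""
    err2 = (est - ref) ** 2
    return np.array([
        1.0,                           # constant offset
        np.mean(err2),                 # mean squared error
        np.mean(np.sqrt(err2 + EPS)),  # smooth approximation of mean absolute error
        np.mean(np.log1p(err2)),       # log-compressed squared error
    ])


def sample_pair(rng, n=256):
    """Draw a random reference signal and a noisy estimate of it."""
    ref = rng.standard_normal(n)
    est = ref + rng.uniform(0.0, 2.0) * rng.standard_normal(n)
    return est, ref


def fit_weights(rng, n_pairs=500):
    """Least-squares fit of combination weights so that basis @ w tracks the metric."""
    pairs = [sample_pair(rng) for _ in range(n_pairs)]
    phi = np.stack([basis(e, r) for e, r in pairs])
    y = np.array([toy_metric(e, r) for e, r in pairs])
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return w


def surrogate(est, ref, w):
    """Differentiable stand-in for the metric: linear combination of the basis functions."""
    return float(basis(est, ref) @ w)


rng = np.random.default_rng(0)
weights = fit_weights(rng)
```

Because every basis function here is smooth in `est`, the fitted `surrogate` could be minimized by gradient descent in place of the quantized metric. A real evaluation metric would replace `toy_metric`, and the paper's own choice of basis functions and fitting procedure may differ from this sketch.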

Pages: 6952-6956

Page count: 5
