Single-channel speech enhancement by subspace affinity minimization

Cited by: 3

Authors:

Tran, Dung N. [1]

Koishida, Kazuhito [1]

Affiliations:

[1] Microsoft Corp, Redmond, WA 98052 USA

Source:

INTERSPEECH 2020 | 2020

Keywords:

speech enhancement; noise reduction; deep neural network; convolutional neural network; regression; subspace affinity

DOI:

10.21437/Interspeech.2020-2982

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology]

Discipline classification codes:

100104; 100213

Abstract:

In data-driven speech enhancement frameworks, learning informative representations is crucial to obtain a high-quality estimate of the target speech. State-of-the-art speech enhancement methods based on deep neural networks (DNN) commonly learn a single embedding from the noisy input to predict clean speech. This compressed representation inevitably contains both noise and speech information, leading to speech distortion and poor noise reduction performance. To alleviate this issue, we proposed to learn separate embeddings for speech and noise from the noisy input and introduced a subspace affinity loss function to prevent information leaking between the two representations. We rigorously proved that minimizing this loss function yields maximally uncorrelated speech and noise representations, which can block information leaking. We empirically showed that our proposed framework outperforms traditional and state-of-the-art speech enhancement methods in various unseen nonstationary noise environments. Our results suggest that learning uncorrelated speech and noise embeddings can improve noise reduction and reduce speech distortion in speech enhancement applications.
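
The abstract does not state the formula of the subspace affinity loss. As an illustration only, the sketch below assumes a common definition of affinity between two subspaces: the sum of squared cosines of their principal angles, normalized by the smaller dimension. The function names `orthonormal_basis` and `subspace_affinity` are hypothetical, and this is not the authors' implementation; it only shows that such a quantity is 0 for orthogonal embedding subspaces and 1 for identical ones.

```python
import math


def orthonormal_basis(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors` (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        # Remove the components of w that lie along the basis vectors found so far.
        for b in basis:
            proj = sum(x * y for x, y in zip(w, b))
            w = [x - proj * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > tol:  # skip vectors that are (numerically) linearly dependent
            basis.append([x / norm for x in w])
    return basis


def subspace_affinity(speech_vectors, noise_vectors):
    """Assumed affinity: ||U^T V||_F^2 / min(dim U, dim V), a value in [0, 1]."""
    u = orthonormal_basis(speech_vectors)
    v = orthonormal_basis(noise_vectors)
    k = min(len(u), len(v))
    if k == 0:
        return 0.0
    # Sum of squared inner products between all pairs of basis vectors.
    total = sum(sum(a * b for a, b in zip(p, q)) ** 2 for p in u for q in v)
    return total / k
```

Under this assumed definition, calling `subspace_affinity([[1, 0, 0, 0], [0, 1, 0, 0]], [[0, 0, 1, 0], [0, 0, 0, 1]])` returns 0.0, because the two spans are orthogonal. Passing the same pair of vectors for both arguments returns 1.0. A training loss built on such a quantity would penalize overlap between the speech and noise embedding subspaces.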

Pages: 2447-2451

Page count: 5

50 records in total

[31] Biophysically-inspired single-channel speech enhancement in the time domain
Wen, Chuan
Verhulst, Sarah
INTERSPEECH 2023, 2023, : 775 - 779
[32] Phase Based Single-Channel Speech Enhancement Using Phase Ratio
Singh, Sachin
Mutawa, A. M.
Gupta, Monika
Tripathy, Manoj
Anand, R. S.
2017 6TH INTERNATIONAL CONFERENCE ON COMPUTER APPLICATIONS IN ELECTRICAL ENGINEERING - RECENT ADVANCES (CERA), 2017, : 393 - 396
[33] New Results in Modulation-Domain Single-Channel Speech Enhancement
Mowlaee, Pejman
Blass, Martin
Kleijn, W. Bastiaan
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2125 - 2137
[34] A Comparative Study on Single-Channel Noise Estimation Methods for Speech Enhancement
Veisi, Hadi
Sameti, Hossein
2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2012, : 645 - 650
[35] Modulation-domain Kalman filtering for single-channel speech enhancement
So, Stephen
Paliwal, Kuldip K.
SPEECH COMMUNICATION, 2011, 53 (06) : 818 - 829
[36] Single-channel speech enhancement using Kalman filtering in the modulation domain
So, Stephen
Wojcicki, Kamil K.
Paliwal, Kuldip K.
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 993 - 996
[37] Deep Learning with Augmented Kalman Filter for Single-Channel Speech Enhancement
Roy, Sujan Kumar
Nicolson, Aaron
Paliwal, Kuldip K.
2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020,
[38] Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement
Taherian, Hassan
Wang, Zhong-Qiu
Chang, Jorge
Wang, DeLiang
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1293 - 1302
[39] Phase-Sensitive Decision-Directed SNR Estimator for Single-Channel Speech Enhancement
Ou, Shifeng
Song, Peng
Gao, Ying
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2017, 31 (08)
[40] NOISE ROBUST EXEMPLAR MATCHING WITH COUPLED DICTIONARIES FOR SINGLE-CHANNEL SPEECH ENHANCEMENT
Yilmaz, Emre
Baby, Deepak
Van Hamme, Hugo
2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 874 - 878
