Single-channel speech enhancement by subspace affinity minimization

Cited by: 3
Authors
Tran, Dung N. [1 ]
Koishida, Kazuhito [1 ]
Affiliation
[1] Microsoft Corp, Redmond, WA 98052 USA
Source
INTERSPEECH 2020 | 2020
Keywords
speech enhancement; noise reduction; deep neural network; convolutional neural network; regression; subspace affinity;
DOI
10.21437/Interspeech.2020-2982
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline Classification Code
100104; 100213
Abstract
In data-driven speech enhancement frameworks, learning informative representations is crucial to obtaining a high-quality estimate of the target speech. State-of-the-art speech enhancement methods based on deep neural networks (DNNs) commonly learn a single embedding from the noisy input to predict clean speech. This compressed representation inevitably contains both noise and speech information, leading to speech distortion and poor noise reduction performance. To alleviate this issue, we propose to learn separate embeddings for speech and noise from the noisy input and introduce a subspace affinity loss function to prevent information from leaking between the two representations. We rigorously prove that minimizing this loss function yields maximally uncorrelated speech and noise representations, which blocks information leakage. We empirically show that our proposed framework outperforms traditional and state-of-the-art speech enhancement methods in a variety of unseen nonstationary noise environments. Our results suggest that learning uncorrelated speech and noise embeddings improves noise reduction and reduces speech distortion in speech enhancement applications.
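The record does not reproduce the paper's definition of subspace affinity. As a hedged illustration only, one plausible formulation of an affinity-style decorrelation measure between a speech embedding matrix and a noise embedding matrix is a normalized Frobenius norm of their cross-Gram matrix, which is zero exactly when the two sets of embedding vectors are mutually orthogonal (the function name and normalization below are assumptions, not the paper's actual loss):

```python
import numpy as np

def subspace_affinity(E_s, E_n, eps=1e-8):
    """Hypothetical affinity between two embedding matrices.

    E_s, E_n: arrays of shape (embedding_dim, n_frames), e.g. the speech
    and noise embeddings extracted from a noisy input. Returns 0 when the
    embeddings are mutually orthogonal (maximally uncorrelated) and grows
    as the two representations overlap.
    """
    cross = E_s.T @ E_n  # cross-Gram matrix between the two embeddings
    return np.linalg.norm(cross, "fro") ** 2 / (
        np.linalg.norm(E_s, "fro") ** 2 * np.linalg.norm(E_n, "fro") ** 2 + eps
    )

# Embeddings confined to disjoint coordinate subspaces have zero affinity,
# while any embedding has positive affinity with itself.
rng = np.random.default_rng(0)
E_s = np.zeros((16, 4)); E_s[:8, :] = rng.standard_normal((8, 4))
E_n = np.zeros((16, 4)); E_n[8:, :] = rng.standard_normal((8, 4))
print(subspace_affinity(E_s, E_n))       # prints 0.0 (orthogonal embeddings)
print(subspace_affinity(E_s, E_s) > 0)   # prints True
```

Driving such a measure to zero during training is one way to encourage the speech and noise branches of a network to carry non-redundant information; the abstract's claim is that its loss provably achieves maximally uncorrelated representations.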
Pages: 2447-2451 (5 pages)