Single-channel speech enhancement by subspace affinity minimization

Cited by: 3
Authors
Tran, Dung N. [1]
Koishida, Kazuhito [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
Source
INTERSPEECH 2020 | 2020
Keywords
speech enhancement; noise reduction; deep neural network; convolutional neural network; regression; subspace affinity
DOI
10.21437/Interspeech.2020-2982
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
In data-driven speech enhancement frameworks, learning informative representations is crucial to obtaining a high-quality estimate of the target speech. State-of-the-art speech enhancement methods based on deep neural networks (DNNs) commonly learn a single embedding from the noisy input to predict clean speech. This compressed representation inevitably contains both noise and speech information, leading to speech distortion and poor noise reduction performance. To alleviate this issue, we proposed to learn separate embeddings for speech and noise from the noisy input and introduced a subspace affinity loss function to prevent information leaking between the two representations. We rigorously proved that minimizing this loss function yields maximally uncorrelated speech and noise representations, which can block information leaking. We empirically showed that our proposed framework outperforms traditional and state-of-the-art speech enhancement methods in various unseen nonstationary noise environments. Our results suggest that learning uncorrelated speech and noise embeddings can improve noise reduction and reduce speech distortion in speech enhancement applications.
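The abstract does not state the loss function itself. As a rough illustration only, the sketch below computes one common definition of the affinity between two subspaces: the Frobenius norm of the product of their orthonormal bases, normalized by the square root of the smaller dimension. The function name `subspace_affinity`, the matrix shapes, the full-column-rank assumption, and the use of NumPy are all assumptions made for this example and are not taken from the paper, whose actual loss may differ.

```python
import numpy as np


def subspace_affinity(speech_emb: np.ndarray, noise_emb: np.ndarray) -> float:
    """Normalized affinity between the column spaces of two matrices.

    Hypothetical helper, not the paper's loss. Each input has shape
    (feature_dim, n_vectors) and is assumed to have full column rank
    with n_vectors <= feature_dim.
    """
    # Orthonormal bases for the two subspaces via reduced QR.
    q_speech, _ = np.linalg.qr(speech_emb)
    q_noise, _ = np.linalg.qr(noise_emb)

    # The singular values of this product are the cosines of the
    # principal angles between the two subspaces.
    cross = q_speech.T @ q_noise

    # Normalize so the result lies in [0, 1].
    k = min(q_speech.shape[1], q_noise.shape[1])
    return float(np.linalg.norm(cross, "fro") / np.sqrt(k))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.standard_normal((64, 8))  # stand-in speech embedding
    noise = rng.standard_normal((64, 8))   # stand-in noise embedding
    print("affinity:", subspace_affinity(speech, noise))
```

Under this definition the value is 0 when the two subspaces are orthogonal and 1 when one contains the other, so driving it toward 0 during training would push the two embeddings toward being uncorrelated, which is the behavior the abstract describes.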
Pages: 2447-2451
Page count: 5
Related papers
50 in total
  • [42] PHASE ESTIMATION IN SINGLE-CHANNEL SPEECH ENHANCEMENT USING PHASE INVARIANCE CONSTRAINTS
    Pirolt, Michael
    Stahl, Johannes
    Mowlaee, Pejman
    Vorobiov, Vasili I.
    Barysenka, Siarhei Y.
    Davydov, Andrew G.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5585 - 5589
  • [43] NOISE-ADAPTIVE DEEP NEURAL NETWORK FOR SINGLE-CHANNEL SPEECH ENHANCEMENT
    Chung, Hanwook
    Kim, Taesup
    Plourde, Eric
    Champagne, Benoit
    2018 IEEE 28TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2018,
  • [44] Single-channel speech enhancement using inter-component phase relations
    Barysenka, Siarhei Y.
    Vorobiov, Vasili I.
    Mowlaee, Pejman
    SPEECH COMMUNICATION, 2018, 99 : 144 - 160
  • [45] Filtering and Refining: A Collaborative-Style Framework for Single-Channel Speech Enhancement
    Li, Andong
    Zheng, Chengshi
    Yu, Guochen
    Cai, Juanjuan
    Li, Xiaodong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2156 - 2172
  • [46] DDP-Unet: A mapping neural network for single-channel speech enhancement
    Chen, Haoxiang
    Xu, Yanyan
    Ke, Dengfeng
    Su, Kaile
    COMPUTER SPEECH AND LANGUAGE, 2025, 93
  • [47] TFDense-GAN: a generative adversarial network for single-channel speech enhancement
    Chen, Haoxiang
    Zhang, Jinxiu
    Fu, Yaogang
    Zhou, Xintong
    Wang, Ruilong
    Xu, Yanyan
    Ke, Dengfeng
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2025, 2025 (01):
  • [48] Complex tensor factorization in modulation frequency domain for single-channel speech enhancement
    Masaya, Shogo
    Unoki, Masashi
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1765 - 1769
  • [49] Single-Channel Speech Enhancement With Phase Reconstruction Based on Phase Distortion Averaging
    Wakabayashi, Yukoh
    Fukumori, Takahiro
    Nakayama, Masato
    Nishiura, Takanobu
    Yamashita, Yoichi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (09) : 1559 - 1569
  • [50] A Single Channel Subspace Speech Enhancement Approach Based on Optimal Lagrange Multiplier in Time Domain Constraint
    Tu, Jingxian
    Qin, Guijiang
    2018 10TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS (ICCCAS 2018), 2018, : 364 - 368