Single-channel speech enhancement by subspace affinity minimization

Cited by: 3
Authors
Tran, Dung N. [1]
Koishida, Kazuhito [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
Source
Keywords
speech enhancement; noise reduction; deep neural network; convolutional neural network; regression; subspace affinity
DOI
10.21437/Interspeech.2020-2982
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
In data-driven speech enhancement frameworks, learning informative representations is crucial to obtaining a high-quality estimate of the target speech. State-of-the-art speech enhancement methods based on deep neural networks (DNN) commonly learn a single embedding from the noisy input to predict clean speech. This compressed representation inevitably contains both noise and speech information, leading to speech distortion and poor noise reduction performance. To alleviate this issue, we proposed learning separate embeddings for speech and noise from the noisy input and introduced a subspace affinity loss function to prevent information leaking between the two representations. We rigorously proved that minimizing this loss function yields maximally uncorrelated speech and noise representations, which can block information leaking. We empirically showed that our proposed framework outperforms traditional and state-of-the-art speech enhancement methods in various unseen nonstationary noise environments. Our results suggest that learning uncorrelated speech and noise embeddings can improve noise reduction and reduce speech distortion in speech enhancement applications.
Pages: 2447-2451
Page count: 5
Related papers
50 in total
  • [41] Single-channel speech enhancement based on joint constrained dictionary learning
    Sun, Linhui
    Bu, Yunyi
    Li, Pingan
    Wu, Zihao
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [42] Deep Learning with Augmented Kalman Filter for Single-Channel Speech Enhancement
    Roy, Sujan Kumar
    Nicolson, Aaron
    Paliwal, Kuldip K.
    2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020,
  • [43] Single-channel speech enhancement in variable noise-level environment
    Lin, CT
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2003, 33 (01) : 137 - 144
  • [44] New Results in Modulation-Domain Single-Channel Speech Enhancement
    Mowlaee, Pejman
    Blass, Martin
    Kleijn, W. Bastiaan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2125 - 2137
  • [45] A Comparative Study on Single-Channel Noise Estimation Methods for Speech Enhancement
    Veisi, Hadi
    Sameti, Hossein
    2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2012, : 645 - 650
  • [46] Modulation-domain Kalman filtering for single-channel speech enhancement
    So, Stephen
    Paliwal, Kuldip K.
    SPEECH COMMUNICATION, 2011, 53 (06) : 818 - 829
  • [47] Biophysically-inspired single-channel speech enhancement in the time domain
    Wen, Chuan
    Verhulst, Sarah
    INTERSPEECH 2023, 2023, : 775 - 779
  • [48] Perceptual Weighting Deep Neural Networks for Single-channel Speech Enhancement
    Han, Wei
    Zhang, Xiongwei
    Min, Gang
    Zhou, Xingyu
    Zhang, Wei
    PROCEEDINGS OF THE 2016 12TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2016, : 446 - 450
  • [49] Phase Based Single-Channel Speech Enhancement Using Phase Ratio
    Singh, Sachin
    Mutawa, A. M.
    Gupta, Monika
    Tripathy, Manoj
    Anand, R. S.
    2017 6TH INTERNATIONAL CONFERENCE ON COMPUTER APPLICATIONS IN ELECTRICAL ENGINEERING - RECENT ADVANCES (CERA), 2017, : 393 - 396
  • [50] Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement
    Taherian, Hassan
    Wang, Zhong-Qiu
    Chang, Jorge
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1293 - 1302