TRAINING NOISY SINGLE-CHANNEL SPEECH SEPARATION WITH NOISY ORACLE SOURCES: A LARGE GAP AND A SMALL STEP

Cited by: 4
Authors
Maciejewski, Matthew [1 ,2 ]
Shi, Jing [1 ,3 ]
Watanabe, Shinji [1 ,2 ]
Khudanpur, Sanjeev [1 ,2 ]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
[3] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
speech separation; noisy speech; deep learning;
DOI
10.1109/ICASSP39728.2021.9413975
Chinese Library Classification
O42 [Acoustics];
Discipline Codes
070206 ; 082403 ;
Abstract
As the performance of single-channel speech separation systems has improved, there has been a desire to move to more challenging conditions than the clean, near-field speech that initial systems were developed on. When training deep learning separation models, the need for ground-truth sources leads to training on synthetic mixtures. As such, training in noisy conditions requires either using noise synthetically added to clean speech, preventing the use of in-domain data for a noisy-condition task, or training on mixtures of noisy speech, requiring the network to additionally separate the noise. We demonstrate the relative inseparability of noise and show that this noisy-speech paradigm leads to significant degradation of system performance. We also propose an SI-SDR-inspired training objective that exploits the inseparability of noise to implicitly partition the signal and discount noise-separation errors, enabling the training of better separation systems with noisy oracle sources.
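The paper's modified objective is not reproduced in this record; as context, the metric it builds on is the standard scale-invariant signal-to-distortion ratio (SI-SDR), which projects the estimate onto the reference so that overall gain does not affect the score. A minimal NumPy sketch (function name and epsilon choice are illustrative, not from the paper):

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio (SI-SDR), in dB.

    The reference is rescaled by the optimal factor alpha so that the
    metric is invariant to the overall gain of the estimate; residual
    energy after removing the scaled target counts as distortion.
    """
    # Optimal scaling of the reference toward the estimate
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference          # scaled target component
    residual = estimate - target        # everything else: distortion + noise
    return 10.0 * np.log10((np.dot(target, target) + eps) /
                           (np.dot(residual, residual) + eps))
```

A gain-scaled copy of the reference scores near-perfectly (residual energy is essentially zero), while an estimate corrupted by additive noise scores close to its SNR; the paper's contribution is a variant of such an objective that discounts errors attributable to inseparable noise when only noisy oracle sources are available.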
Pages: 5774 - 5778 (5 pages)
Related Papers
50 records (first 10 shown)
  • [1] WHAMR!: NOISY AND REVERBERANT SINGLE-CHANNEL SPEECH SEPARATION
    Maciejewski, Matthew
    Wichern, Gordon
    McQuinn, Emmett
    Le Roux, Jonathan
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 696 - 700
  • [2] Speaker Counting and Separation From Single-Channel Noisy Mixtures
    Chetupalli, Srikanth Raj
    Habets, Emanuel A. P.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1681 - 1692
  • [3] A comprehensive study on supervised single-channel noisy speech separation with multi-task learning
    Dang, Shaoxiang
    Matsumoto, Tetsuya
    Takeuchi, Yoshinori
    Kudo, Hiroaki
    SPEECH COMMUNICATION, 2025, 167
  • [4] Extracting sources from noisy abdominal phonograms: a single-channel blind source separation method
    Jimenez-Gonzalez, A.
    James, C. J.
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2009, 47 (06) : 655 - 664
  • [6] One-pass single-channel noisy speech recognition using a combination of noisy and enhanced features
    Fujimoto, Masakiyo
    Kawai, Hisashi
    INTERSPEECH 2019, 2019, : 486 - 490
  • [7] BLIND ROOM VOLUME ESTIMATION FROM SINGLE-CHANNEL NOISY SPEECH
    Genovese, Andrea F.
    Gamper, Hannes
    Pulkki, Ville
    Raghuvanshi, Nikunj
    Tashev, Ivan J.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 231 - 235
  • [8] SINGLE-CHANNEL ENHANCEMENT OF CONVOLUTIVE NOISY SPEECH BASED ON A DISCRIMINATIVE NMF ALGORITHM
    Chung, Hanwook
    Plourde, Eric
    Champagne, Benoit
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2302 - 2306
  • [9] Single-Channel Speech Dereverberation in Noisy Environment for Non-Orthogonal Signals
    Fahim, Abdullah
    Samarasinghe, Prasanga N.
    Abhayapala, Thushara D.
    ACTA ACUSTICA UNITED WITH ACUSTICA, 2018, 104 (06) : 1041 - 1055
  • [10] GLMSNET: SINGLE CHANNEL SPEECH SEPARATION FRAMEWORK IN NOISY AND REVERBERANT ENVIRONMENTS
    Shi, Huiyu
    Chen, Xi
    Kong, Tianlong
    Yin, Shouyi
    Ouyang, Peng
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 663 - 670