WHAMR!: NOISY AND REVERBERANT SINGLE-CHANNEL SPEECH SEPARATION

Cited by: 0
Authors
Maciejewski, Matthew [1,2]
Wichern, Gordon [1]
McQuinn, Emmett [3]
Le Roux, Jonathan [1]
Affiliations
[1] Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA 02139 USA
[2] Johns Hopkins University, Baltimore, MD 21218 USA
[3] Whisperai, San Francisco, CA USA
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Keywords
speech separation; speech enhancement; cocktail party problem; reverberation
DOI
10.1109/icassp40776.2020.9053327
CLC Number
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
While significant advances have been made with respect to the separation of overlapping speech signals, studies have been largely constrained to mixtures of clean, near-anechoic speech, which is not representative of many real-world scenarios. Although the WHAM! dataset introduced noise to the ubiquitous wsj0-2mix dataset, it did not include reverberation, which is generally present in indoor recordings outside of recording studios. The spectral smearing caused by reverberation can result in significant performance degradation for standard deep learning-based speech separation systems, which rely on spectral structure and the sparsity of speech signals to tease apart sources. To address this, we introduce WHAMR!, an augmented version of WHAM! with synthetic reverberated sources, and provide a thorough baseline analysis of current techniques as well as novel cascaded architectures on the newly introduced conditions.
Pages: 696 - 700
Page count: 5
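
As a rough illustration of how the noisy, reverberant two-speaker mixtures described in the abstract are constructed, the sketch below reverberates each source with a room impulse response, sums the reverberant sources, and adds background noise at a target SNR. This is a minimal stand-in, not the official WHAMR! generation code: the released scripts simulate room impulse responses with pyroomacoustics from sampled room geometries and mix WSJ0 utterances with WHAM! ambient noise recordings under their own level and SNR conventions, whereas here the RIR is synthetic decaying noise and the signals are random placeholders.

```python
# Illustrative sketch only: a toy recipe for a noisy-reverberant two-speaker mixture.
# The real WHAMR! pipeline simulates RIRs with pyroomacoustics and mixes WSJ0 speech
# with WHAM! noise; the RIR, signals, and SNR convention below are simplifications.
import numpy as np
from scipy.signal import fftconvolve

SR = 16000  # sampling rate used in this sketch


def toy_rir(rt60=0.4, sr=SR, rng=None):
    """Exponentially decaying white noise as a stand-in for a simulated RIR."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = int(rt60 * sr)
    t = np.arange(n) / sr
    rir = rng.standard_normal(n) * np.exp(-6.9 * t / rt60)  # ~60 dB decay over rt60 s
    return rir / np.max(np.abs(rir))


def mix_noisy_reverberant(sources, noise, snr_db=5.0, rng=None):
    """Reverberate each source, sum them, and add noise at a target SNR (dB)."""
    if rng is None:
        rng = np.random.default_rng(0)
    reverberant = [fftconvolve(s, toy_rir(rng=rng))[: len(s)] for s in sources]
    mixture = np.sum(reverberant, axis=0)
    noise = noise[: len(mixture)]
    # Scale the noise so that 10*log10(speech power / noise power) equals snr_db.
    p_speech = np.mean(mixture ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12
    noise = noise * np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return mixture + noise, reverberant


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    s1 = rng.standard_normal(4 * SR)       # placeholder for a WSJ0 utterance
    s2 = rng.standard_normal(4 * SR)       # placeholder for a second, overlapping utterance
    ambient = rng.standard_normal(4 * SR)  # placeholder for a WHAM!-style noise clip
    mix, refs = mix_noisy_reverberant([s1, s2], ambient, snr_db=5.0, rng=rng)
    print(mix.shape, [r.shape for r in refs])
```

In the actual corpus, the reverberant sources are kept alongside the anechoic ones, so systems can be trained and evaluated under clean, noisy, reverberant, and noisy-reverberant conditions.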