ENHANCED TIME-FREQUENCY MASKING BY USING NEURAL NETWORKS FOR MONAURAL SOURCE SEPARATION IN REVERBERANT ROOM ENVIRONMENTS

Citations: 0
Authors
Sun, Yang [1]
Wang, Wenwu [2]
Chambers, Jonathon A. [1]
Naqvi, Syed Mohsen [1]
Affiliations
[1] Newcastle Univ, Intelligent Sensing & Commun Res Grp, Newcastle Upon Tyne, Tyne & Wear, England
[2] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford, Surrey, England
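Source
2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2018
Keywords
source separation; reverberant room environments; dereverberation; time-frequency mask; SPEECH; RECOGNITION; NOISE
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract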
Deep neural networks (DNNs) have been used for dereverberation and denoising in the monaural source separation problem. However, the performance of current state-of-the-art methods is limited, particularly when applied in highly reverberant room environments. In this paper, we propose an enhanced time-frequency (T-F) mask to improve the separation performance. The ideal enhanced mask (IEM) consists of the dereverberation mask (DM) and the ideal ratio mask (IRM). The DM is specifically applied to eliminate the reverberations in the speech mixture and the IRM helps in denoising. The IEEE and the TIMIT corpora with real room impulse responses (RIRs) and noise from the NOISEX dataset are used to generate speech mixtures for evaluations. The proposed method outperforms the state-of-the-art methods specifically in highly reverberant and noisy room environments.
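The abstract states only that the IEM combines a DM and an IRM; it does not give their formulas. The following is a minimal illustrative sketch, not the authors' implementation. It uses the standard IRM definition, sqrt(S^2 / (S^2 + N^2)), and ASSUMES a DM defined as the ratio of the anechoic mixture magnitude to the reverberant mixture magnitude, clipped to [0, 1], with the two masks applied as a product. All function names are hypothetical, and the masks are computed from oracle signals rather than estimated by a DNN as in the paper.

```python
import math


def ideal_ratio_mask(speech_mag, noise_mag):
    """Standard IRM for one T-F unit: sqrt(S^2 / (S^2 + N^2))."""
    denom = speech_mag ** 2 + noise_mag ** 2
    return math.sqrt(speech_mag ** 2 / denom) if denom > 0 else 0.0


def dereverberation_mask(anechoic_mix_mag, reverberant_mix_mag):
    """ASSUMED DM (not taken from the abstract): anechoic-to-reverberant
    mixture magnitude ratio for one T-F unit, clipped to [0, 1]."""
    if reverberant_mix_mag <= 0:
        return 0.0
    return min(1.0, max(0.0, anechoic_mix_mag / reverberant_mix_mag))


def apply_enhanced_mask(reverberant_mix, anechoic_mix, speech, noise):
    """Apply DM and IRM as a product to every T-F unit.

    Each argument is a magnitude spectrogram given as a list of frames,
    where each frame is a list of per-frequency magnitudes.
    """
    return [
        [
            y * dereverberation_mask(x, y) * ideal_ratio_mask(s, n)
            for y, x, s, n in zip(y_row, x_row, s_row, n_row)
        ]
        for y_row, x_row, s_row, n_row in zip(
            reverberant_mix, anechoic_mix, speech, noise
        )
    ]


# Toy one-frame, two-bin magnitudes (made-up numbers for illustration only).
speech = [[3.0, 0.0]]
noise = [[4.0, 2.0]]
anechoic_mix = [[7.0, 2.0]]
reverberant_mix = [[14.0, 4.0]]

estimate = apply_enhanced_mask(reverberant_mix, anechoic_mix, speech, noise)
print(estimate)
```

In this toy case the DM halves each bin to undo the assumed reverberant energy, and the IRM then scales the first bin by the speech share and zeroes the second bin, which contains only noise.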
Pages: 1647-1651
Page count: 5
Related papers
50 in total
  • [1] Two-Stage Monaural Source Separation in Reverberant Room Environments Using Deep Neural Networks
    Sun, Yang
    Wang, Wenwu
    Chambers, Jonathon
    Naqvi, Syed Mohsen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (01) : 125 - 139
  • [2] Localization based stereo speech source separation using probabilistic time-frequency masking and deep neural networks
    Yu, Yang
    Wang, Wenwu
    Han, Peng
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2016,
  • [3] Time-frequency masking for blind source separation with preserved spatial cues
    Pirhosseinloo, Shadi
    Kokkinakis, Kostas
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1188 - 1192
  • [4] Localization based stereo speech source separation using probabilistic time-frequency masking and deep neural networks
    Yang Yu
    Wenwu Wang
    Peng Han
    EURASIP Journal on Audio, Speech, and Music Processing, 2016
  • [5] Segmented Time-Frequency Masking Algorithm for Speech Separation Based on Deep Neural Networks
    Guo, Xinyu
    Ou, Shifeng
    Gao, Meng
    Gao, Ying
    2020 13TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2020), 2020, : 445 - 450
  • [6] SPATIAL AND COHERENCE CUES BASED TIME-FREQUENCY MASKING FOR BINAURAL REVERBERANT SPEECH SEPARATION
    Alinaghi, Atiyeh
    Wang, Wenwu
    Jackson, Philip J. B.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 684 - 688
  • [7] Cycle GAN-Based Audio Source Separation Using Time-Frequency Masking
    Joseph, Sujo
    Rajan, Rajeev
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (02) : 1163 - 1180
  • [8] Robust TDOA Estimation Based on Time-Frequency Masking and Deep Neural Networks
    Wang, Zhong-Qiu
    Zhang, Xueliang
    Wang, DeLiang
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 322 - 326
  • [9] Unsupervised Learning for Monaural Source Separation Using Maximization-Minimization Algorithm with Time-Frequency Deconvolution
    Woo, Wai Lok
    Gao, Bin
    Bouridane, Ahmed
    Ling, Bingo Wing-Kuen
    Chin, Cheng Siong
    SENSORS, 2018, 18 (05)
  • [10] SEQUENTIALLY TRAINED DNNS BASED MONAURAL SOURCE SEPARATION IN REAL ROOM ENVIRONMENTS
    Li, Yi
    Sun, Yang
    Naqvi, Syed Mohsen
    2019 SENSOR SIGNAL PROCESSING FOR DEFENCE CONFERENCE (SSPD), 2019,