ENHANCED TIME-FREQUENCY MASKING BY USING NEURAL NETWORKS FOR MONAURAL SOURCE SEPARATION IN REVERBERANT ROOM ENVIRONMENTS

Cited by: 0
Authors
Sun, Yang [1]
Wang, Wenwu [2]
Chambers, Jonathon A. [1]
Naqvi, Syed Mohsen [1]
Affiliations
[1] Newcastle Univ, Intelligent Sensing & Commun Res Grp, Newcastle Upon Tyne, Tyne & Wear, England
[2] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford, Surrey, England
Keywords
source separation; reverberant room environments; dereverberation; time-frequency mask; SPEECH; RECOGNITION; NOISE;
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
Deep neural networks (DNNs) have been used for dereverberation and denoising in the monaural source separation problem. However, the performance of current state-of-the-art methods is limited, particularly in highly reverberant room environments. In this paper, we propose an enhanced time-frequency (T-F) mask to improve the separation performance. The ideal enhanced mask (IEM) consists of the dereverberation mask (DM) and the ideal ratio mask (IRM). The DM is applied specifically to eliminate reverberation in the speech mixture, and the IRM helps with denoising. Speech mixtures for evaluation are generated from the IEEE and TIMIT corpora with real room impulse responses (RIRs) and noise from the NOISEX dataset. The proposed method outperforms state-of-the-art methods, particularly in highly reverberant and noisy room environments.
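The abstract does not give the mask definitions, so the following is only a minimal sketch of how two T-F masks might be combined, not the authors' method. It assumes the commonly used IRM form sqrt(S^2 / (S^2 + N^2)), assumes the DM is the magnitude ratio of the anechoic mixture to the reverberant mixture, and assumes the two masks are applied multiplicatively. All function names and the toy data are hypothetical.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero in silent T-F units


def ideal_ratio_mask(speech_mag, noise_mag):
    """Commonly used IRM form (assumption): sqrt(S^2 / (S^2 + N^2))."""
    s2 = speech_mag ** 2
    n2 = noise_mag ** 2
    return np.sqrt(s2 / (s2 + n2 + EPS))


def dereverberation_mask(anechoic_mix_mag, reverberant_mix_mag):
    """Assumed DM: ratio of anechoic to reverberant mixture magnitude."""
    return anechoic_mix_mag / (reverberant_mix_mag + EPS)


def apply_masks(reverberant_mix_mag, dm, irm):
    """Assumed combination: element-wise product of both masks with the input."""
    return reverberant_mix_mag * dm * irm


# Toy magnitude spectrograms (4 frequency bins x 6 frames), not real audio.
rng = np.random.default_rng(0)
speech_mag = rng.uniform(0.1, 1.0, size=(4, 6))
noise_mag = rng.uniform(0.1, 1.0, size=(4, 6))

# Toy simplification: magnitudes are treated as additive, and reverberation
# is modelled as a per-unit gain of at least 1.
anechoic_mix_mag = speech_mag + noise_mag
reverberant_mix_mag = anechoic_mix_mag * rng.uniform(1.0, 3.0, size=(4, 6))

irm = ideal_ratio_mask(speech_mag, noise_mag)
dm = dereverberation_mask(anechoic_mix_mag, reverberant_mix_mag)
estimate = apply_masks(reverberant_mix_mag, dm, irm)
```

In this toy setting the DM undoes the reverberation gain and the IRM then attenuates noise-dominated units, so `estimate` is approximately `anechoic_mix_mag * irm`. In practice a DNN would predict the masks from features of the reverberant mixture, since the ideal masks require the clean signals.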
Pages: 1647-1651
Number of pages: 5