SEQUENTIALLY TRAINED DNNS BASED MONAURAL SOURCE SEPARATION IN REAL ROOM ENVIRONMENTS

Times cited: 0
Authors
Li, Yi [1]
Sun, Yang [1]
Naqvi, Syed Mohsen [1]
Affiliations
[1] Newcastle Univ, Sch Engn, Intelligent Sensing & Commun Grp, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
Keywords
Deep neural networks; monaural source separation; dereverberation mask; sequentially; FALL DETECTION SYSTEM; SPEECH SEPARATION; RECOGNITION; MASKING; NOISE
DOI
10.1109/sspd.2019.8751658
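Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809
Abstract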
In recent studies, deep neural networks (DNNs) have been introduced to solve the monaural source separation (MSS) problem in real room environments. However, the separation performance of existing methods is limited, especially in environments with larger RT60s. In this paper, we propose a system that trains two DNNs sequentially to mitigate this challenge and improve separation performance. Our dereverberation mask (DM) is used as the training target for DNN1, and a new enhanced ratio mask (ERM) is used as the training target for DNN2. The IEEE and TIMIT corpora, together with real room impulse responses and noise interferences from the NOISEX dataset, are used to generate speech mixtures for evaluation. The proposed method outperforms the state-of-the-art methods.
Pages: 5