DISCRIMINATIVE DEEP RECURRENT NEURAL NETWORKS FOR MONAURAL SPEECH SEPARATION

Cited: 0
Authors
Wang, Guan-Xiang [1 ]
Hsu, Chung-Chien [1 ]
Chien, Jen-Tzung [1 ]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu 30010, Taiwan
Keywords
deep learning; discriminative learning; neural network; monaural speech separation; FACTORIZATION;
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
Deep neural networks are now a popular approach to a variety of problems in speech processing. In this paper, we propose a discriminative deep recurrent neural network (DRNN) model for monaural speech separation. Our idea is to construct the DRNN as a regression model that discovers the deep structure and regularity needed to reconstruct two source spectra from their mixture. To reinforce the discrimination between the two separated spectra, we estimate the DRNN separation parameters by minimizing an integrated objective function consisting of two terms. One is the within-source reconstruction error for the individual source spectra; the other conveys discrimination information that preserves the mutual difference between the two source spectra during supervised training. This discrimination term acts as a kind of regularization that maintains between-source separation in monaural source separation. In the experiments, we demonstrate the effectiveness of the proposed method for speech separation in comparison with other methods.
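The integrated objective described in the abstract (within-source reconstruction errors minus a discrimination term) can be sketched as follows. This is a minimal illustration only: the squared-error cost, the weight `gamma`, and all variable names are assumptions for exposition, not details taken from the paper.

```python
import numpy as np

def discriminative_loss(s1_hat, s2_hat, s1, s2, gamma=0.05):
    """Sketch of a discriminative separation objective.

    within  - reconstruction errors of each estimated spectrum
              against its own target source.
    between - how far each estimate is from the OTHER source;
              subtracting it (weighted by the assumed gamma)
              encourages mutual difference between the outputs.
    """
    # Within-source reconstruction errors (squared Euclidean distance).
    within = np.sum((s1_hat - s1) ** 2) + np.sum((s2_hat - s2) ** 2)
    # Between-source term: large when each estimate differs from the
    # other source's spectrum.
    between = np.sum((s1_hat - s2) ** 2) + np.sum((s2_hat - s1) ** 2)
    return within - gamma * between
```

With `gamma = 0` this reduces to a plain per-source regression loss; a small positive `gamma` trades a little reconstruction accuracy for greater separation between the two outputs.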
Pages: 2544-2548
Page count: 5
Related Papers
50 records
  • [21] Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis
    Ochieng, Peter
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL3) : S3651 - S3703
  • [22] Deep Representation-Decoupling Neural Networks for Monaural Music Mixture Separation
    Li, Zhuo
    Wang, Hongwei
    Zhao, Miao
    Li, Wenjie
    Guo, Minyi
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 93 - 100
  • [23] RECURRENT DEEP NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Weng, Chao
    Yu, Dong
    Watanabe, Shinji
    Juang, Biing-Hwang
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [24] Distilled Binary Neural Network for Monaural Speech Separation
    Chen, Xiuyi
    Liu, Guangcan
    Shi, Jing
    Xu, Jiaming
    Xu, Bo
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [25] A Deep Ensemble Learning Method for Monaural Speech Separation
    Zhang, Xiao-Lei
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (05) : 967 - 977
  • [26] Deep Attractor with Convolutional Network for Monaural Speech Separation
    Lan, Tian
    Qian, Yuxin
    Tai, Wenxin
    Chu, Boce
    Liu, Qiao
    2020 11TH IEEE ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE (UEMCON), 2020, : 40 - 44
  • [27] Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation
    Pyykkonen, Pyry
    Mimilakis, Stylianos I.
    Drossos, Konstantinos
    Virtanen, Tuomas
    2020 IEEE 22ND INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2020,
  • [28] An Improved Supervised Speech Separation Method Based on Perceptual Weighted Deep Recurrent Neural Networks
    Han, Wei
    Zhang, Xiongwei
    Sun, Meng
    Li, Li
    Shi, Wenhua
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (02) : 718 - 721
  • [29] RECURRENT NEURAL NETWORKS FOR COCHANNEL SPEECH SEPARATION IN REVERBERANT ENVIRONMENTS
    Delfarah, Masood
    Wang, DeLiang
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5404 - 5408
  • [30] Dilated convolutional recurrent neural network for monaural speech enhancement
    Pirhosseinloo, Shadi
    Brumberg, Jonathan S.
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 158 - 162