Speech Separation based on Deep Belief Network

Cited by: 0
Authors
Wu Haijia [1]
Zhang Xiongwei [1]
Zhang Liangliang [1]
Zou Xia [1]
Affiliation
[1] PLA Univ Sci & Technol, Coll Command Informat & Syst, Nanjing 210007, Jiangsu, Peoples R China
Source
PROCEEDINGS OF THE 2015 INTERNATIONAL INDUSTRIAL INFORMATICS AND COMPUTER ENGINEERING CONFERENCE | 2015
Keywords
speech separation; deep learning; deep belief network; restricted Boltzmann machine; autoencoder; SIGNAL; SEGREGATION
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Thanks to its hierarchical and generative nature, the Deep Belief Network (DBN) is effective for feature representation and extraction in signal processing. In this paper, the DBN is investigated and applied to monaural speech separation. First, two separate DBNs are trained to extract features from the mixed noisy signals and from the clean target speech, respectively. Next, the two sets of extracted features are linked by training a back-propagation (BP) neural network that maps the features of the mixed signals to the features of the target speech. Finally, by applying the DBNs together with this mapping network, the target speech can be estimated from the input mixed signals. Experiments are conducted on three kinds of mixtures: female/male speech, human speech with Gaussian noise, and human speech with music. The PESQ scores of the extracted speech are 3.32, 2.59, and 3.42, respectively, which indicates that the model performs well on speech separation tasks, especially on mixtures in which the interference signal has an obvious spectral structure.
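The three-stage pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The record gives no layer sizes, input features, or training settings, so every value below is an assumption. In particular, a single Bernoulli RBM trained with CD-1 stands in for each multi-layer DBN, random vectors in [0, 1] stand in for normalized spectral frames, and the names `RBM` and `train_mapping` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1).
    Assumption: one RBM stands in for a full stacked DBN."""
    def __init__(self, n_visible, n_hidden):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible bias
        self.c = np.zeros(n_hidden)   # hidden bias

    def encode(self, v):
        return sigmoid(v @ self.W + self.c)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def fit(self, data, epochs=50, lr=0.1):
        n = len(data)
        for _ in range(epochs):
            h0 = self.encode(data)                          # positive phase
            h_sample = (rng.random(h0.shape) < h0).astype(float)
            v1 = self.decode(h_sample)                      # reconstruction
            h1 = self.encode(v1)                            # negative phase
            self.W += lr * (data.T @ h0 - v1.T @ h1) / n
            self.b += lr * (data - v1).mean(axis=0)
            self.c += lr * (h0 - h1).mean(axis=0)
        return self

def train_mapping(x, y, n_hidden=32, epochs=500, lr=0.5):
    """One-hidden-layer network trained by backpropagation on squared error."""
    W1 = 0.1 * rng.standard_normal((x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.1 * rng.standard_normal((n_hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    for _ in range(epochs):
        h = sigmoid(x @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        d_out = (out - y) * out * (1 - out) / len(x)  # gradient at output pre-activation
        d_h = (d_out @ W2.T) * h * (1 - h)            # gradient at hidden pre-activation
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (x.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return lambda z: sigmoid(sigmoid(z @ W1 + b1) @ W2 + b2)

# Toy stand-ins for normalized spectral frames (assumed data, not from the paper).
clean = rng.random((200, 64))
mixed = np.clip(clean + 0.3 * rng.random((200, 64)), 0.0, 1.0)

# Stage 1: separate feature extractors for mixed and clean signals.
rbm_mixed = RBM(64, 32).fit(mixed)
rbm_clean = RBM(64, 32).fit(clean)

# Stage 2: map mixed-signal features to clean-speech features.
mapping = train_mapping(rbm_mixed.encode(mixed), rbm_clean.encode(clean))

# Stage 3: encode the mixture, map the features, decode with the clean model.
estimate = rbm_clean.decode(mapping(rbm_mixed.encode(mixed)))
```

Here `estimate` has the same shape as `clean`. In a real system the frames would come from a time-frequency analysis of recorded audio, and the estimated frames would still have to be converted back to a waveform before a measure such as PESQ could be computed.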
Pages: 1486-1493
Number of pages: 8