Single Channel Source Separation with General Stochastic Networks


Authors:

Zoehrer, Matthias [1]

Pernkopf, Franz [1]

Affiliations:

[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, Graz, Austria

Source:

15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014), Vols 1-4 | 2014

Funding:

Austrian Science Fund;

Keywords:

general stochastic network; speech separation; speech enhancement; single channel source separation; SPEECH; NOISE; INTELLIGIBILITY; ALGORITHM;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Single channel source separation (SCSS) is ill-posed and thus challenging. In this paper, we apply general stochastic networks (GSNs), a deep neural network architecture, to SCSS. We extend GSNs to predict a time-frequency representation, i.e., a soft mask, by introducing a hybrid generative-discriminative training objective to the network. We evaluate GSNs on data from the 2nd CHiME speech separation challenge. In particular, we provide results for four tasks: a speaker-dependent task, a speaker-independent task, a matched noise condition task, and an unmatched noise condition task. Empirically, we compare GSNs to other deep architectures, namely a deep belief network (DBN) and a multi-layer perceptron (MLP). In general, deep architectures perform well on SCSS tasks.
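To make the notion of a soft mask concrete, the following is a minimal sketch rather than the authors' method. It assumes a common ratio-style definition, in which each time-frequency bin holds the fraction of magnitude attributed to speech; the record itself does not specify the mask definition. The function names `soft_mask` and `apply_mask`, the toy array shapes, and the additive-magnitude mixture are all illustrative assumptions.

```python
import numpy as np


def soft_mask(speech_mag, noise_mag, eps=1e-8):
    """Ratio-style soft mask with values in [0, 1] per time-frequency bin.

    Assumed definition (not taken from the paper): speech / (speech + noise).
    """
    return speech_mag / (speech_mag + noise_mag + eps)


def apply_mask(mixture_spec, mask):
    """Scale each time-frequency bin of the mixture by the mask value."""
    return mask * mixture_spec


# Toy magnitude "spectrograms" with 4 frequency bins and 5 frames.
rng = np.random.default_rng(0)
speech_mag = rng.random((4, 5))
noise_mag = rng.random((4, 5))

# Toy assumption: magnitudes add exactly in the mixture.
mixture_spec = speech_mag + noise_mag

mask = soft_mask(speech_mag, noise_mag)
estimate = apply_mask(mixture_spec, mask)
```

In this toy setting the mask is computed from the known sources, so the estimate recovers the speech magnitudes almost exactly. In a separation system the mask would instead be predicted by a trained model from the mixture alone.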


Pages: 978-982

Number of pages: 5
