Single Channel Source Separation with General Stochastic Networks

Cited by: 0
Authors
Zoehrer, Matthias [1 ]
Pernkopf, Franz [1 ]
Affiliations
[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, Graz, Austria
Source
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014
Funding
Austrian Science Fund;
Keywords
general stochastic network; speech separation; speech enhancement; single channel source separation; SPEECH; NOISE; INTELLIGIBILITY; ALGORITHM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Single channel source separation (SCSS) is ill-posed and thus challenging. In this paper, we apply general stochastic networks (GSNs), a deep neural network architecture, to SCSS. We extend GSNs to predict a time-frequency representation, i.e. a soft mask, by introducing a hybrid generative-discriminative training objective to the network. We evaluate GSNs on data from the 2nd CHiME speech separation challenge. In particular, we provide results for four tasks: speaker-dependent, speaker-independent, matched noise condition, and unmatched noise condition. Empirically, we compare to other deep architectures, namely a deep belief network (DBN) and a multi-layer perceptron (MLP). In general, deep architectures perform well on SCSS tasks.
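As a rough illustration of what a soft mask over a time-frequency representation does, the sketch below builds a ratio-style mask from known speech and noise magnitudes and applies it to their mixture. This is not the method of the paper: the mask definition, the additive-magnitude assumption, and the names `soft_mask` and `apply_mask` are all hypothetical choices made for this example. In the setting the abstract describes, a network would predict the mask from the mixture instead of computing it from the clean sources.

```python
import numpy as np

def soft_mask(speech_mag, noise_mag, eps=1e-8):
    """Ratio-style soft mask with values in [0, 1] per time-frequency bin.

    Assumed definition for illustration only; the abstract does not
    specify how the target mask is computed.
    """
    return speech_mag / (speech_mag + noise_mag + eps)

def apply_mask(mask, mixture_mag):
    """Element-wise masking of the mixture magnitude spectrogram."""
    return mask * mixture_mag

# Toy magnitude "spectrograms": 4 frequency bins x 6 time frames.
rng = np.random.default_rng(0)
speech = rng.random((4, 6))
noise = rng.random((4, 6))

# Simplifying assumption: magnitudes add in the mixture.
mixture = speech + noise

mask = soft_mask(speech, noise)
estimate = apply_mask(mask, mixture)
```

Under the additive assumption, the masked mixture recovers the speech magnitudes up to the small `eps` term. With real signals the mixture magnitude is not an exact sum, so the recovery is only approximate.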
Pages: 978-982
Number of pages: 5