Discriminative Nonnegative Dictionary Learning using Cross-Coherence Penalties for Single Channel Source Separation

Cited by: 0
Authors
Grais, Emad M. [1 ]
Erdogan, Hakan [1 ]
Affiliations
[1] Sabanci Univ, Fac Engn & Nat Sci, TR-34956 Istanbul, Turkey
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Keywords
Single channel source separation; nonnegative matrix factorization; discriminative training; dictionary learning; MATRIX FACTORIZATION; ALGORITHMS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we introduce a new discriminative training method for nonnegative dictionary learning. The new method can be used in single channel source separation (SCSS) applications. In SCSS, nonnegative matrix factorization (NMF) is used to learn a dictionary (a set of basis vectors) for each source in the magnitude spectrum domain. The trained dictionaries are then used to decompose the mixed signal and estimate each source. Learning discriminative dictionaries for the source signals can improve the separation performance. To obtain discriminative dictionaries, we try to prevent the basis set of one source's dictionary from representing the other source signals. We propose to minimize the cross-coherence between the dictionaries of all sources in the mixed signal. We incorporate a simplified cross-coherence penalty into a regularized NMF cost function to simultaneously learn discriminative and reconstructive dictionaries. The new regularized NMF update rules used to discriminatively train the dictionaries are introduced in this work. Experimental results show that discriminative training gives better separation results than conventional NMF.
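The idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the record does not give the cost function or the update rules, so everything specific below is an assumption. The sketch assumes a Euclidean NMF cost per source, takes the penalty to be lam * ||B1^T B2||_F^2 between the two dictionaries, and uses standard multiplicative updates with the penalty gradient placed in the denominator. The function names, `rank`, `lam`, and the column normalization step are all hypothetical choices made for illustration.

```python
import numpy as np

def cross_coherence(B1, B2):
    """Assumed penalty: squared Frobenius norm of B1^T B2."""
    return float(np.sum((B1.T @ B2) ** 2))

def normalize_columns(B, G, eps=1e-12):
    """Scale each basis vector to unit l2 norm and move the scale into G."""
    norms = np.linalg.norm(B, axis=0) + eps
    return B / norms, G * norms[:, None]

def train_discriminative_nmf(V1, V2, rank=4, lam=1.0, n_iter=200,
                             seed=0, eps=1e-12):
    """Jointly learn nonnegative dictionaries B1, B2 for two sources.

    V1, V2: nonnegative magnitude spectrograms (frequency x frames)
    of the training data for each source. lam weights the assumed
    cross-coherence penalty; lam = 0 gives two independent NMF runs.
    """
    rng = np.random.default_rng(seed)
    V = [V1, V2]
    n_freq = V1.shape[0]
    B = [rng.random((n_freq, rank)) + eps for _ in V]
    G = [rng.random((rank, Vi.shape[1])) + eps for Vi in V]
    for _ in range(n_iter):
        for i in (0, 1):
            other = B[1 - i]
            # Gains: standard Euclidean multiplicative update.
            G[i] *= (B[i].T @ V[i]) / (B[i].T @ B[i] @ G[i] + eps)
            # Dictionary: the penalty gradient, lam * other @ other.T @ B,
            # is nonnegative and therefore goes into the denominator.
            num = V[i] @ G[i].T
            den = (B[i] @ (G[i] @ G[i].T)
                   + lam * other @ (other.T @ B[i]) + eps)
            B[i] *= num / den
            # Unit-norm columns stop the penalty from being reduced
            # simply by shrinking B and inflating G.
            B[i], G[i] = normalize_columns(B[i], G[i])
    return B[0], B[1], G[0], G[1]
```

Under these assumptions, comparing `cross_coherence(B1, B2)` for `lam=0` and for a positive `lam` on the same training data gives a rough check of whether the penalty makes the dictionaries less mutually coherent. Because of the normalization step, the updates are not guaranteed to decrease the cost monotonically.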
Pages: 808-812
Page count: 5