SPARSENESS-BASED MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION

被引:0
作者
Higuchi, Takuya [1 ]
Yoshioka, Takuya [1 ]
Nakatani, Tomohiro [1 ]
机构
[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan
来源
2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC) | 2016年
关键词
audio source separation; sparseness; nonnegative matrix factorization; MIXTURES;
D O I
暂无
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
This paper deals with the problem of audio source separation using multichannel observation. Utilizing the sparseness of sound signals in the time-frequency domain is a successful approach to source separation that enables us to perform separation based on spatial features obtained from a microphone array. On the other hand, nonnegative matrix factorization (NMF) is also a promising approach for audio source separation, which performs separation based on spectral features. This paper incorporates the idea of NMF into sparseness-based source separation and proposes a novel approach to multichannel source separation based on both spatial and spectral features. Experimental results reveal that our proposed method improves the signal-to-distortion ratio (SDR) by 0.26 dB and the signal-to-interference ratio (SIR) by 1.96 dB compared with a conventional sparseness-based approach. In addition, our proposed model eliminates the need for a number of matrix inversions thanks to the sparseness assumption, and thereby requires a much lower computational cost than a previously-proposed multichannel NMF approach, which also utilizes spectral and spatial features.
引用
收藏
页数:5
相关论文
共 14 条
  • [11] Performance measurement in blind audio source separation
    Vincent, Emmanuel
    Gribonval, Remi
    Févotte, Cedric
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (04): : 1462 - 1469
  • [12] Monaural sound source separation by nonnegative matrix factorization with tempora continuity and sparseness criteria
    Virtanen, Tuomas
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): : 1066 - 1074
  • [13] Blind separation of speech mixtures via time-frequency masking
    Yilmaz, Ö
    Rickard, S
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2004, 52 (07) : 1830 - 1847
  • [14] Yoshioka T, 2015, 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), P436, DOI 10.1109/ASRU.2015.7404828