SPARSENESS-BASED MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION

Times cited: 0
Authors
Higuchi, Takuya [1]
Yoshioka, Takuya [1]
Nakatani, Tomohiro [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan
Source
2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC) | 2016
Keywords
audio source separation; sparseness; nonnegative matrix factorization; MIXTURES;
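DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808 ; 0809 ;
Abstract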
This paper addresses audio source separation from multichannel observations. Exploiting the sparseness of sound signals in the time-frequency domain is a successful approach to source separation, because it allows separation based on spatial features obtained from a microphone array. Nonnegative matrix factorization (NMF) is another promising approach to audio source separation, one that performs separation based on spectral features. This paper incorporates the idea of NMF into sparseness-based source separation and proposes a novel approach to multichannel source separation based on both spatial and spectral features. Experimental results show that the proposed method improves the signal-to-distortion ratio (SDR) by 0.26 dB and the signal-to-interference ratio (SIR) by 1.96 dB compared with a conventional sparseness-based approach. In addition, thanks to the sparseness assumption, the proposed model eliminates the need for a number of matrix inversions and therefore requires a much lower computational cost than a previously proposed multichannel NMF approach, which also utilizes spectral and spatial features.
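As a rough illustration of the spectral-feature side mentioned in the abstract, the following is a minimal sketch of generic single-channel NMF applied to a nonnegative spectrogram-like matrix. It is not the model proposed in the paper: it assumes the squared Euclidean cost with the standard Lee-Seung multiplicative updates, and the names `nmf`, `demo`, `rank`, and `n_iter` are hypothetical choices made for this example. The spatial (microphone-array) part of the method is not represented here.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix V (frequency x time) as W @ H.

    Assumption: squared Euclidean cost with Lee-Seung multiplicative
    updates. This is generic NMF, not the paper's multichannel model.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    # Random nonnegative initialization of spectral bases and activations.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(V, W, H):
    """Frobenius-norm reconstruction error relative to the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)


def demo():
    """Factorize a synthetic rank-2 nonnegative matrix (toy data only)."""
    rng = np.random.default_rng(1)
    V = rng.random((64, 2)) @ rng.random((2, 100))
    W0, H0 = nmf(V, rank=2, n_iter=0)    # random initialization only
    W, H = nmf(V, rank=2, n_iter=200)    # same seed, after the updates
    return V, (W0, H0), (W, H)
```

In this sketch the columns of `W` play the role of spectral basis vectors and the rows of `H` their time-varying activations; a separation system would still need some rule for assigning bases to sources, which the toy example leaves out.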
Pages: 5