SPARSENESS-BASED MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION

Cited by: 0

Authors:

Higuchi, Takuya [1]

Yoshioka, Takuya [1]

Nakatani, Tomohiro [1]

Affiliation:

[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan

Source:

2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC) | 2016

Keywords:

audio source separation; sparseness; nonnegative matrix factorization; MIXTURES;

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification code:

0808; 0809

Abstract:

This paper deals with the problem of audio source separation using multichannel observation. Utilizing the sparseness of sound signals in the time-frequency domain is a successful approach to source separation that enables us to perform separation based on spatial features obtained from a microphone array. On the other hand, nonnegative matrix factorization (NMF) is also a promising approach for audio source separation, which performs separation based on spectral features. This paper incorporates the idea of NMF into sparseness-based source separation and proposes a novel approach to multichannel source separation based on both spatial and spectral features. Experimental results reveal that our proposed method improves the signal-to-distortion ratio (SDR) by 0.26 dB and the signal-to-interference ratio (SIR) by 1.96 dB compared with a conventional sparseness-based approach. In addition, our proposed model eliminates the need for a number of matrix inversions thanks to the sparseness assumption, and thereby requires a much lower computational cost than a previously-proposed multichannel NMF approach, which also utilizes spectral and spatial features.
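As a rough illustration of the spectral side mentioned in the abstract, the sketch below factorizes a nonnegative spectrogram-like matrix V into basis spectra W and activations H using the standard Euclidean multiplicative updates. This is generic single-channel NMF, not the multichannel sparseness-based model proposed in the paper; the function name `nmf_multiplicative`, the rank, the iteration count, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Generic NMF (hypothetical helper): V ~= W @ H with W, H >= 0.

    Uses the classic multiplicative updates for the squared Euclidean
    cost. This is NOT the paper's multichannel model.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random nonnegative initialisation of bases and activations.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic nonnegative "spectrogram" with an exact rank-4 structure.
rng = np.random.default_rng(1)
V = rng.random((64, 4)) @ rng.random((4, 100))

W, H = nmf_multiplicative(V, rank=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In a separation setting, each source would typically be assigned a subset of the columns of W; how the paper couples such spectral bases with spatial features is not specified in this record.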

Pages: 5
