Underdetermined Convolutive BSS: Bayes Risk Minimization Based on a Mixture of Super-Gaussian Posterior Approximation

Cited by: 19

Authors:

Cho, Janghoon [1]

Yoo, Chang D. [1]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Dept Elect Engn, Taejon 305701, South Korea

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2015 / Vol. 23 / No. 05

Keywords:

Bayesian estimation; blind source separation (BSS); cocktail party problem; underdetermined convolutive mixture

DOI:

10.1109/TASLP.2015.2409778

Chinese Library Classification (CLC):

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper considers the underdetermined blind source separation (BSS) of convolutively mixed super-Gaussian signals that include speech, audio, and various other sparse signals. Here, the separation is performed in three steps. In the first and second steps, the mixing matrix and the sources at each time-frequency location are estimated by minimizing the Bayes risk (or the posterior risk) with squared loss. In the third and final step, permutation alignment is conducted by considering the correlation between adjacent spectral bins, as in many conventional algorithms. To avoid computationally intractable integrations involving a complex-valued super-Gaussian source prior, the posterior distribution of the sources is approximated as a mixture of super-Gaussians. The posterior means of the mixing matrix and the sources are obtained with Metropolis-Hastings within Gibbs sampling and the weighted sum of individual super-Gaussians, respectively. Overall, this approximation leads to a separation that is computationally lighter than, and as accurate as, the algorithm without the approximation. Simulation results on synthetically generated data in a virtual room with reverberation show that the estimates of the mixing matrix in the first step and the sources in the second step are more accurate than the estimates from state-of-the-art algorithms in terms of the mixing error ratio (MER) and the signal-to-distortion ratio (SDR). An experiment was also conducted with recorded data in a real room environment using a public benchmark dataset. The results show that the proposed algorithm outperforms state-of-the-art algorithms in terms of the SDR.
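
As a reading aid, the link between "minimizing the Bayes risk with squared loss" and "posterior mean" can be written out as a short sketch. The notation is assumed for illustration and is not taken from the paper: s is the source vector at one time-frequency location, x is the observed mixture, and q_k is the k-th super-Gaussian component with weight w_k and mean \mu_k. Under squared loss the risk-minimizing estimate is the posterior mean, and if the posterior is approximated by a mixture, that mean reduces to a weighted sum of the component means.

```latex
% Assumed notation (not from the abstract):
%   s     : source vector at one time-frequency location
%   x     : observed mixture at that location
%   q_k   : k-th super-Gaussian component, weight w_k, mean \mu_k
\begin{align}
\hat{s} &= \arg\min_{\tilde{s}} \,
           \mathbb{E}\!\left[ \lVert s - \tilde{s} \rVert^{2} \mid x \right]
         = \mathbb{E}[\, s \mid x \,], \\
p(s \mid x) &\approx \sum_{k} w_k \, q_k(s),
           \qquad \sum_{k} w_k = 1, \\
\hat{s} &\approx \sum_{k} w_k \, \mu_k,
           \qquad \mu_k = \int s \, q_k(s) \, ds .
\end{align}
```

How the weights and component means are actually computed is specific to the paper and is not reproduced here.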


Pages: 828-839

Number of pages: 12
