A frequency domain method for blind source separation of convolutive audio mixtures

Cited by: 82

Authors:

Rahbar, K [1]

Reilly, JP [1]

Affiliations:

[1] McMaster Univ, Dept Elect & Comp Engn, Hamilton, ON L8S 4K1, Canada

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2005, Vol. 13, No. 5

Funding:

Natural Sciences and Engineering Research Council of Canada

Keywords:

audio enhancement; frequency domain blind; source separation; joint diagonalization; permutation ambiguity;

DOI:

10.1109/TSA.2005.851925

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, we propose a new frequency domain approach to blind source separation (BSS) of audio signals mixed in a reverberant environment. We propose a joint diagonalization procedure on the cross power spectral density matrices of the signals at the output of the mixing system to identify the mixing system at each frequency bin up to a scale and permutation ambiguity. The frequency domain joint diagonalization is performed using a new and quickly converging algorithm which uses an alternating least-squares (ALS) optimization method. The inverse of the mixing system is then used to separate the sources. An efficient dyadic algorithm to resolve the frequency dependent permutation ambiguities that exploits the inherent nonstationarity of the sources is presented. The effect of the unknown scaling ambiguities is partially resolved using an initialization procedure for the ALS algorithm. The performance of the proposed algorithm is demonstrated by experiments conducted in real reverberant rooms. Performance comparisons are made with previous methods.
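
As a rough illustration of the quantity the abstract builds on, the sketch below estimates cross power spectral density matrices of the observed mixtures for each frequency bin over several time epochs, which is the kind of matrix set a joint diagonalization step would operate on. It is not the authors' estimator or their ALS algorithm; the function name `cpsd_matrices`, the Hann window, and the parameters `n_fft`, `hop`, and `n_epochs` are assumptions made for this example.

```python
import numpy as np

def cpsd_matrices(x, n_fft=256, hop=128, n_epochs=4):
    """Estimate cross power spectral density matrices (illustrative only).

    x: real array of shape (n_channels, n_samples) holding the mixtures.
    Returns a complex array of shape (n_epochs, n_bins, n_channels, n_channels).
    """
    n_ch, n_samp = x.shape
    window = np.hanning(n_fft)
    n_frames = 1 + (n_samp - n_fft) // hop

    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[:, idx] * window            # (n_ch, n_frames, n_fft)
    X = np.fft.rfft(frames, axis=-1)       # (n_ch, n_frames, n_bins)

    # Group consecutive frames into epochs so that nonstationary
    # sources yield a different matrix in each epoch.
    per_epoch = n_frames // n_epochs
    X = X[:, :per_epoch * n_epochs, :].reshape(n_ch, n_epochs, per_epoch, -1)

    # P[e, f] = average over frames t of X[:, t, f] X[:, t, f]^H.
    return np.einsum('ietf,jetf->efij', X, X.conj()) / per_epoch

# Example with synthetic two-channel noise (hypothetical data).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8000))
P = cpsd_matrices(x)
print(P.shape)
```

Each `P[e, f]` is a Hermitian matrix; under the convolutive mixing model described in the abstract, the matrices at a given bin would share a common mixing matrix across epochs, which is what makes joint diagonalization applicable.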

Pages: 832-844

Number of pages: 13

50 items in total

[1] Frequency-domain implementation of a time-domain blind separation algorithm for convolutive mixtures of sources
Ohata, Masashi
Matsuoka, Kiyotoshi
INDEPENDENT COMPONENT ANALYSIS AND SIGNAL SEPARATION, PROCEEDINGS, 2007, 4666 : 528 - +
[2] A Sparsity-Based Method to Solve Permutation Indeterminacy in Frequency-Domain Convolutive Blind Source Separation
Sudhakar, Prasad
Gribonval, Remi
INDEPENDENT COMPONENT ANALYSIS AND SIGNAL SEPARATION, PROCEEDINGS, 2009, 5441 : 338 - 345
[3] AUDIO SOURCE SEPARATION BASED ON CONVOLUTIVE TRANSFER FUNCTION AND FREQUENCY-DOMAIN LASSO OPTIMIZATION
Li, Xiaofei
Girin, Laurent
Horaud, Radu
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 541 - 545
[4] Convolutive blind source separation method based on tensor decomposition
Ma B.
Zhang T.
An Z.
Deng P.
Tongxin Xuebao/Journal on Communications, 2021, 42 (08): : 52 - 60
[5] A Nonlinear Prediction Approach to the Blind Separation of Convolutive Mixtures
Ricardo Suyama
Leonardo Tomazeli Duarte
Rafael Ferrari
Leandro Elias Paiva Rangel
Romis Ribeiro de Faissol Attux
Charles Casimiro Cavalcante
Fernando José Von Zuben
João Marcos Travassos Romano
EURASIP Journal on Advances in Signal Processing, 2007
[6] Efficient Frequency Domain Implementation of Noncausal Multichannel Blind Deconvolution for Convolutive Mixtures of Speech
Mirsamadi, Seyedmahdad
Ghaffarzadegan, Shabnam
Sheikhzadeh, Hamid
Ahadi, Seyed Mohammad
Rezaie, Amir Hossein
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (08): : 2365 - 2377
[7] NEURAL NETWORK ALTERNATIVES TO CONVOLUTIVE AUDIO MODELS FOR SOURCE SEPARATION
Venkataramani, Shrikant
Subakan, Cem
Smaragdis, Paris
2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,
[8] Blind source separation for convolutive mixtures based on the joint diagonalization of power spectral density matrices
Mei, Tiemin
Mertins, Alfred
Yin, Fuliang
Xi, Jiangtao
Chicharo, Joe F.
SIGNAL PROCESSING, 2008, 88 (08) : 1990 - 2007
[9] Eliminating the Permutation Ambiguity of Convolutive Blind Source Separation by Using Coupled Frequency Bins
Xie, Kan
Zhou, Guoxu
Yang, Junjie
He, Zhaoshui
Xie, Shengli
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (02) : 589 - 599
[10] Batch and Adaptive PARAFAC-Based Blind Separation of Convolutive Speech Mixtures
Nion, Dimitri
Mokios, Kleanthis N.
Sidiropoulos, Nicholas D.
Potamianos, Alexandros
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06): : 1193 - 1207
