A frequency domain method for blind source separation of convolutive audio mixtures

Cited by: 82

Authors:

Rahbar, K [1]

Reilly, JP [1]

Affiliations:

[1] McMaster Univ, Dept Elect & Comp Engn, Hamilton, ON L8S 4K1, Canada

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2005 / Vol. 13 / No. 5

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

audio enhancement; frequency domain blind source separation; joint diagonalization; permutation ambiguity;

DOI:

10.1109/TSA.2005.851925

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose a new frequency domain approach to blind source separation (BSS) of audio signals mixed in a reverberant environment. We propose a joint diagonalization procedure on the cross power spectral density matrices of the signals at the output of the mixing system to identify the mixing system at each frequency bin up to a scale and permutation ambiguity. The frequency domain joint diagonalization is performed using a new and quickly converging algorithm which uses an alternating least-squares (ALS) optimization method. The inverse of the mixing system is then used to separate the sources. An efficient dyadic algorithm to resolve the frequency dependent permutation ambiguities that exploits the inherent nonstationarity of the sources is presented. The effect of the unknown scaling ambiguities is partially resolved using an initialization procedure for the ALS algorithm. The performance of the proposed algorithm is demonstrated by experiments conducted in real reverberant rooms. Performance comparisons are made with previous methods.
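
The joint diagonalization model described in the abstract can be illustrated with a small numerical sketch. This is not the ALS algorithm from the paper. It only assumes that, at one frequency bin, each epoch's cross power spectral density matrix has the form H D_k H^H with a diagonal, epoch-dependent source PSD D_k. The mixing matrix `H`, the values in `source_psds`, and the helper `off_diagonal_energy` are hypothetical and chosen purely for illustration.

```python
import numpy as np

def off_diagonal_energy(mats):
    """Sum of squared magnitudes of the off-diagonal entries over all matrices."""
    total = 0.0
    for m in mats:
        off = m - np.diag(np.diag(m))
        total += float(np.sum(np.abs(off) ** 2))
    return total

# Hypothetical 2x2 mixing matrix at a single frequency bin (assumption).
H = np.array([[1.0, 0.5j],
              [0.3, 1.0]], dtype=complex)

# Nonstationary sources: the diagonal source PSD changes from epoch to epoch.
source_psds = np.array([[1.0, 0.2],
                        [0.3, 1.5],
                        [2.0, 0.7]])

# Cross-PSD matrices observed at the output of the mixing system.
P = [H @ np.diag(d) @ H.conj().T for d in source_psds]

# The inverse of the mixing system diagonalizes every epoch simultaneously.
W = np.linalg.inv(H)
separated = [W @ p @ W.conj().T for p in P]

# Any permuted and rescaled version of W also diagonalizes them, which is
# the scale and permutation ambiguity mentioned in the abstract.
perm = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
scale = np.diag([2.0, -0.5j])
W_ambiguous = perm @ scale @ W
separated_ambiguous = [W_ambiguous @ p @ W_ambiguous.conj().T for p in P]

before = off_diagonal_energy(P)
after = off_diagonal_energy(separated)
after_ambiguous = off_diagonal_energy(separated_ambiguous)
print(before, after, after_ambiguous)
```

In this sketch the off-diagonal energy is clearly nonzero for the mixed matrices and numerically negligible after applying either `W` or `W_ambiguous`. That is why a separate step is needed to align permutations across frequency bins.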


Pages: 832-844

Page count: 13

50 records in total

[41] Cycle GAN-Based Audio Source Separation Using Time–Frequency Masking
Sujo Joseph
Rajeev Rajan
Circuits, Systems, and Signal Processing, 2023, 42 : 1163 - 1180
[42] BENCHMARKING FLEXIBLE ADAPTIVE TIME-FREQUENCY TRANSFORMS FOR UNDERDETERMINED AUDIO SOURCE SEPARATION
Nesbit, Andrew
Vincent, Emmanuel
Plumbley, Mark D.
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 37 - +
[43] Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation
Saito, Koichi
Nakamura, Tomohiko
Yatabe, Kohei
Saruwatari, Hiroshi
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2928 - 2943
[44] A TIKHONOV REGULARIZATION METHOD FOR SPECTRUM DECOMPOSITION IN LOW LATENCY AUDIO SOURCE SEPARATION
Marxer, Ricard
Janer, Jordi
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 277 - 280
[45] A WATERMARKING-BASED METHOD FOR SINGLE-CHANNEL AUDIO SOURCE SEPARATION
Parvaix, Mathieu
Girin, Laurent
Brossier, Jean-Marc
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 101 - +
[46] Spatial blind source separation
Bachoc, Francois
Genton, Marc G.
Nordhausen, Klaus
Ruiz-Gazen, Anne
Virta, Joni
BIOMETRIKA, 2020, 107 (03) : 627 - 646
[48] Real-time blind audio source separation: performance assessment on an advanced digital signal processor
Pani, Danilo
Pani, Alessandro
Raffo, Luigi
JOURNAL OF SUPERCOMPUTING, 2014, 70 (03) : 1555 - 1576
[49] A General Algebraic Algorithm for Blind Extraction of One Source in a MIMO Convolutive Mixture
Dubroca, Remi
De Luigi, Christophe
Castella, Marc
Moreau, Eric
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2010, 58 (05) : 2484 - 2493
[50] Multichannel High-Resolution NMF for Modeling Convolutive Mixtures of Non-Stationary Signals in the Time-Frequency Domain
Badeau, Roland
Plumbley, Mark D.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (11) : 1670 - 1680
