A frequency domain method for blind source separation of convolutive audio mixtures

Cited by: 82
Authors
Rahbar, K [1]
Reilly, JP [1]
Affiliation
[1] McMaster Univ, Dept Elect & Comp Engn, Hamilton, ON L8S 4K1, Canada
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2005 / Vol. 13 / No. 05
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
audio enhancement; frequency domain blind source separation; joint diagonalization; permutation ambiguity;
DOI
10.1109/TSA.2005.851925
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we propose a new frequency domain approach to blind source separation (BSS) of audio signals mixed in a reverberant environment. We propose a joint diagonalization procedure on the cross power spectral density matrices of the signals at the output of the mixing system to identify the mixing system at each frequency bin up to a scale and permutation ambiguity. The frequency domain joint diagonalization is performed using a new and quickly converging algorithm which uses an alternating least-squares (ALS) optimization method. The inverse of the mixing system is then used to separate the sources. An efficient dyadic algorithm to resolve the frequency dependent permutation ambiguities that exploits the inherent nonstationarity of the sources is presented. The effect of the unknown scaling ambiguities is partially resolved using an initialization procedure for the ALS algorithm. The performance of the proposed algorithm is demonstrated by experiments conducted in real reverberant rooms. Performance comparisons are made with previous methods.
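The scale and permutation ambiguity mentioned in the abstract can be illustrated with a small numerical sketch. This is not taken from the paper: the 2x2 mixing matrix `H`, the per-epoch source power values in `epochs`, and the helper names `cross_psd` and `max_abs_diff` are all hypothetical, and the model P = H diag(d) H^H at a single frequency bin is an assumed simplification. The sketch shows that swapping and rescaling the columns of the mixing matrix, with a matching change to the source powers, reproduces exactly the same cross power spectral density matrices, so joint diagonalization alone cannot tell the two factorizations apart.

```python
import cmath

def cross_psd(H, d):
    """Return P = H diag(d) H^H for a square complex matrix H (list of rows)."""
    n = len(H)
    return [[sum(H[i][j] * d[j] * H[l][j].conjugate() for j in range(n))
             for l in range(n)] for i in range(n)]

def max_abs_diff(A, B):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hypothetical 2x2 mixing matrix at one frequency bin (assumed values).
H = [[1.0 + 0.5j, 0.3 - 0.2j],
     [0.4 + 0.1j, 0.9 - 0.7j]]

# Nonstationary sources: diagonal source powers for three epochs (assumed values).
epochs = [[1.0, 0.5], [0.2, 1.5], [2.0, 0.8]]

# Alternative factorization: swap the columns and rescale each by a complex gain.
perm = [1, 0]
gain = [2.0 * cmath.exp(0.3j), 0.5 * cmath.exp(-1.1j)]
H_alt = [[row[perm[j]] * gain[j] for j in range(2)] for row in H]
epochs_alt = [[d[perm[j]] / abs(gain[j]) ** 2 for j in range(2)] for d in epochs]

# Both factorizations give the same matrices in every epoch.
worst = max(max_abs_diff(cross_psd(H, d), cross_psd(H_alt, da))
            for d, da in zip(epochs, epochs_alt))
print(f"largest discrepancy between the two factorizations: {worst:.2e}")
```

Because the ambiguity arises independently at each frequency bin, the column order can differ from bin to bin, which is why a separate permutation-alignment step such as the one described in the abstract is needed before the sources can be reconstructed.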
Pages: 832-844
Page count: 13
Related papers
50 in total
  • [41] Cycle GAN-Based Audio Source Separation Using Time–Frequency Masking
    Sujo Joseph
    Rajeev Rajan
    Circuits, Systems, and Signal Processing, 2023, 42 : 1163 - 1180
  • [42] BENCHMARKING FLEXIBLE ADAPTIVE TIME-FREQUENCY TRANSFORMS FOR UNDERDETERMINED AUDIO SOURCE SEPARATION
    Nesbit, Andrew
    Vincent, Emmanuel
    Plumbley, Mark D.
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 37 - +
  • [43] Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation
    Saito, Koichi
    Nakamura, Tomohiko
    Yatabe, Kohei
    Saruwatari, Hiroshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2928 - 2943
  • [44] A TIKHONOV REGULARIZATION METHOD FOR SPECTRUM DECOMPOSITION IN LOW LATENCY AUDIO SOURCE SEPARATION
    Marxer, Ricard
    Janer, Jordi
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 277 - 280
  • [45] A WATERMARKING-BASED METHOD FOR SINGLE-CHANNEL AUDIO SOURCE SEPARATION
    Parvaix, Mathieu
    Girin, Laurent
    Brossier, Jean-Marc
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 101 - +
  • [46] Spatial blind source separation
    Bachoc, Francois
    Genton, Marc G.
    Nordhausen, Klaus
    Ruiz-Gazen, Anne
    Virta, Joni
    BIOMETRIKA, 2020, 107 (03) : 627 - 646
  • [47] Real-time blind audio source separation: performance assessment on an advanced digital signal processor
    Danilo Pani
    Alessandro Pani
    Luigi Raffo
    The Journal of Supercomputing, 2014, 70 : 1555 - 1576
  • [48] Real-time blind audio source separation: performance assessment on an advanced digital signal processor
    Pani, Danilo
    Pani, Alessandro
    Raffo, Luigi
    JOURNAL OF SUPERCOMPUTING, 2014, 70 (03) : 1555 - 1576
  • [49] A General Algebraic Algorithm for Blind Extraction of One Source in a MIMO Convolutive Mixture
    Dubroca, Remi
    De Luigi, Christophe
    Castella, Marc
    Moreau, Eric
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2010, 58 (05) : 2484 - 2493
  • [50] Multichannel High-Resolution NMF for Modeling Convolutive Mixtures of Non-Stationary Signals in the Time-Frequency Domain
    Badeau, Roland
    Plumbley, Mark D.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (11) : 1670 - 1680