A General Flexible Framework for the Handling of Prior Information in Audio Source Separation

Cited by: 185
Authors
Ozerov, Alexey [1]
Vincent, Emmanuel [1]
Bimbot, Frederic [2]
Affiliations
[1] Rennes Bretagne Atlantique, INRIA, F-35042 Rennes, France
[2] CNRS UMR 6074, IRISA, F-35042 Rennes, France
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2012, Vol. 20, No. 4
Keywords
Audio source separation; expectation-maximization; local Gaussian model; nonnegative matrix factorization; NONNEGATIVE MATRIX FACTORIZATION; MAXIMUM-LIKELIHOOD; SIGNAL SEPARATION; SPEECH SEPARATION; BLIND SEPARATION; MODEL; DECONVOLUTION; COVARIANCE; EXTRACTION; MIXTURES;
DOI
10.1109/TASL.2011.2172425
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Most audio source separation methods are developed for a particular scenario characterized by the number of sources and channels and the characteristics of the sources and the mixing process. In this paper, we introduce a general audio source separation framework based on a library of structured source models that enable the incorporation of prior knowledge about each source via user-specifiable constraints. While this framework generalizes several existing audio source separation methods, it also makes it possible to devise and implement new efficient methods that have not yet been reported in the literature. We first introduce the framework by describing the model structure and constraints, explaining its generality, and summarizing its algorithmic implementation using a generalized expectation-maximization algorithm. Finally, we illustrate the above-mentioned capabilities of the framework by applying it in several new and existing configurations to different source separation problems. We have released a software tool named Flexible Audio Source Separation Toolbox (FASST) implementing a baseline version of the framework in Matlab.
Pages: 1118-1133
Number of pages: 16