Binaural source separation based on spatial cues and maximum likelihood model adaptation

Cited by: 9
Authors
Abdipour, Roohollah [1]
Akbari, Ahmad [1]
Rahmani, Mohsen [1, 2]
Nasersharif, Babak [1, 3]
Affiliations
[1] Iran Univ Sci & Technol, Sch Comp Engn, Audio & Speech Proc Lab, Tehran, Iran
[2] Arak Univ, Fac Engn, Dept Comp Engn, Arak, Iran
[3] KN Toosi Univ Technol, Dept Elect & Comp Engn, Tehran, Iran
Keywords
Binaural source separation; Model adaptation; Maximum likelihood linear regression; Statistical signal processing; Speech enhancement; NONNEGATIVE MATRIX FACTORIZATION; INDEPENDENT COMPONENT ANALYSIS; BLIND SEPARATION; SPEECH; ENHANCEMENT; INFORMATION; STATISTICS; NMF
DOI
10.1016/j.dsp.2014.09.001
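Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809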
Abstract
This paper describes a system for separating multiple moving sound sources from two-channel recordings based on spatial cues and a model adaptation technique. We employ a statistical model of observed interaural level and phase differences, where maximum likelihood estimation of model parameters is achieved through an expectation-maximization algorithm. This model is used to partition spectrogram points into several clusters (one cluster per source) and generate spectrogram masks accordingly for isolating individual sound sources. We follow a maximum likelihood linear regression (MLLR) approach for tracking source relocations and adapting model parameters accordingly. The proposed algorithm is able to separate more sources than input channels, i.e., in the underdetermined setting. In simulated anechoic and reverberant environments with two and three speakers, the proposed model-adaptation algorithm yields more than 10 dB gain in signal-to-noise ratio improvement for azimuthal source relocations of 15 degrees or more. Moreover, this performance gain is achievable with only 0.6 seconds of input mixture received after relocation. © 2014 Elsevier Inc. All rights reserved.
Pages: 174-183
Number of pages: 10
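
The clustering step described in the abstract (fit a statistical model of interaural level and phase differences with expectation-maximization, then turn cluster memberships into spectrogram masks) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single diagonal-covariance Gaussian mixture shared across all frequencies, ignores phase wrapping, and leaves out the MLLR adaptation step entirely. The function names `interaural_cues` and `em_soft_masks`, and the quantile-based initialization, are hypothetical choices made for the illustration.

```python
import numpy as np


def interaural_cues(left, right, eps=1e-12):
    """Return ILD (dB) and IPD (radians) for two complex STFT arrays of equal shape."""
    ild = 20.0 * np.log10((np.abs(left) + eps) / (np.abs(right) + eps))
    ipd = np.angle(left * np.conj(right))
    return ild, ipd


def em_soft_masks(ild, ipd, n_sources=2, n_iter=50):
    """Fit a diagonal-covariance Gaussian mixture to (ILD, IPD) points with EM.

    Assumption for this sketch: one mixture shared by all frequencies and no
    handling of phase wrapping. Returns soft masks of shape
    (n_sources, *ild.shape) and the component means.
    """
    shape = ild.shape
    x = np.stack([ild.ravel(), ipd.ravel()], axis=1)  # (N, 2) feature matrix
    n = x.shape[0]

    # Deterministic initialization: means at evenly spaced ILD quantiles.
    order = np.argsort(x[:, 0])
    idx = order[((np.arange(n_sources) + 0.5) * n / n_sources).astype(int)]
    mu = x[idx].copy()
    var = np.tile(x.var(axis=0) + 1e-6, (n_sources, 1))
    w = np.full(n_sources, 1.0 / n_sources)

    for _ in range(n_iter):
        # E-step: posterior probability of each source for each point.
        diff = x[:, None, :] - mu[None, :, :]  # (N, K, 2)
        log_p = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=2)
        log_p = log_p + np.log(w)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: maximum likelihood updates of weights, means, variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / n
        mu = (resp.T @ x) / nk[:, None]
        diff = x[:, None, :] - mu[None, :, :]
        var = np.einsum("nk,nkd->kd", resp, diff**2) / nk[:, None] + 1e-6

    masks = resp.T.reshape((n_sources,) + shape)
    return masks, mu
```

Under these assumptions, multiplying one channel's STFT by `masks[k]` and inverting the transform would give a rough estimate of source `k`. Tracking a moving source, as the abstract describes, would additionally require re-estimating or adapting the model parameters over time.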