Pseudo-Determined Blind Source Separation for Ad-hoc Microphone Networks

Cited by: 13

Authors:

Wang, Lin [1]

Cavallaro, Andrea [1]

Affiliations:

[1] Queen Mary Univ London, Ctr Intelligent Sensing, London E1 4NS, England

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2018 / Vol. 26 / No. 05

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

Ad-hoc; asynchronous recording; blind source separation; over-determined mixture; PERMUTATION ALIGNMENT; SPEECH; ALGORITHMS; MIXTURES;

DOI:

10.1109/TASLP.2018.2803263

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We propose a pseudo-determined blind source separation framework that exploits the information from a large number of microphones in an ad-hoc network to extract and enhance sound sources in a reverberant scenario. After compensating for the time offsets and sampling rate mismatch between (asynchronous) signals, we interpret the over-determined M x N mixture as a determined M x M mixture, where M is the number of microphones and N is the number of sources. Next, we propose a pseudo-determined mixture model that can apply an M x M independent component analysis (ICA) directly to the M-channel recordings. Moreover, we propose a reference-based permutation alignment scheme that aligns the permutation of the ICA outputs and classifies them into target channels, which contain the N sources, and nontarget channels, which contain reverberation residuals. Finally, using the signals from the nontarget channels, we estimate in each target channel the power spectral density of the noise component, which we suppress with a spectral postfilter. Interestingly, we also obtain late-reverberation suppression as a by-product. Experiments show that each processing block incrementally improves source separation and that the performance of the proposed pseudo-determined separation improves as the number of microphones increases.
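The final step in the abstract (estimating a noise power spectral density from the nontarget channels and suppressing it with a spectral postfilter) can be illustrated with a minimal sketch. This is not the authors' estimator, which the abstract does not specify. It assumes, purely for illustration, that the noise PSD in a target channel can be approximated by the mean power across the nontarget channels, and it applies a Wiener-style gain with a floor. The function name `wiener_postfilter` and the parameter `gain_floor` are hypothetical.

```python
import numpy as np


def wiener_postfilter(target_stft, nontarget_stfts, gain_floor=0.1):
    """Apply a Wiener-style spectral gain to one target channel.

    target_stft:     complex array, shape (frames, bins)
    nontarget_stfts: complex array, shape (channels, frames, bins)
    gain_floor:      lower bound on the gain (hypothetical parameter)
    """
    # Instantaneous power of the target channel in each time-frequency bin.
    target_psd = np.abs(target_stft) ** 2

    # Assumption (not from the paper): the noise PSD is approximated by
    # the mean power of the nontarget channels in the same bin.
    noise_psd = np.mean(np.abs(nontarget_stfts) ** 2, axis=0)

    # Wiener-style gain: 1 - noise / (signal + noise), clipped to a floor
    # so that no bin is attenuated below gain_floor.
    eps = 1e-12  # avoids division by zero in silent bins
    gain = 1.0 - noise_psd / (target_psd + eps)
    gain = np.clip(gain, gain_floor, 1.0)

    return gain * target_stft
```

With silent nontarget channels the gain is 1 and the target passes through unchanged; when the nontarget power matches the target power, the gain falls to `gain_floor`.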


Pages: 981-994

Page count: 14

References (45 in total):

[1] IMAGE METHOD FOR EFFICIENTLY SIMULATING SMALL-ROOM ACOUSTICS [J]. ALLEN, JB; BERKLEY, DA. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65 (04): 943-950

[2] [Anonymous], P INT WORKSH IND COM

[3] [Anonymous], P INTERSPEECH

[4] Equivalence between frequency-domain blind source separation and frequency-domain adaptive beamforming for convolutive mixtures [J]. Araki, S; Makino, S; Hinamoto, Y; Mukai, R; Nishikawa, T; Saruwatari, H. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2003, 2003 (11): 1157-1166

[5] The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech [J]. Araki, S; Mukai, R; Makino, S; Nishikawa, T; Saruwatari, H. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (02): 109-116

[6] Combined approach of array processing and independent component analysis for blind separation of acoustic signals [J]. Asano, F; Ikeda, S; Ogawa, M; Asoh, H; Kitawaki, N. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (03): 204-215

[7] Bertrand A., 2011, P IEEE S COMM VEH TE, P1, DOI 10.1109/SCVT.2011.6101302

[8] On the importance of early reflections for speech in rooms [J]. Bradley, JS; Sato, H; Picard, M. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2003, 113 (06): 3233-3244

[9] Douglas SC, 2007, INT CONF ACOUST SPEE, P637

[10] Spatio-temporal FastICA algorithms for the blind separation of convolutive mixtures [J]. Douglas, Scott C.; Gupta, Malay; Sawada, Hiroshi; Makino, Shoji. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (05): 1511-1520
