Distributed Expectation-Maximization Algorithm for Speaker Localization in Reverberant Environments

Cited by: 20
Authors
Dorfan, Yuval [1]
Plinge, Axel [2]
Hazan, Gershon [1]
Gannot, Sharon [1]
Affiliations
[1] Bar Ilan Univ, Fac Engn, IL-5290002 Ramat Gan, Israel
[2] Tech Univ Dortmund, Dept Comp Sci, D-44227 Dortmund, Germany
Keywords
Precedence effect; onset dominance; distributed expectation-maximization; auditory scene analysis; sound source localization; spectral masking; incremental expectation-maximization; truncated Gaussian; multi-path; time difference of arrival; OF-ARRIVAL ESTIMATION; ACOUSTIC SOURCE LOCALIZATION; SOUND SOURCE LOCALIZATION; EM ALGORITHM; SPEECH RECOGNITION; MICROPHONE ARRAY; TRACKING; NOISE; ROOM; MODEL;
DOI
10.1109/TASLP.2017.2788198
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Localization of acoustic sources has attracted a considerable amount of research attention in recent years. A major obstacle to achieving high localization accuracy is the presence of reverberation, the influence of which obviously increases with the number of active speakers in the room. Human hearing is capable of localizing acoustic sources even in extreme conditions. In this study, we propose to combine a method based on human hearing mechanisms and a modified incremental distributed expectation-maximization (IDEM) algorithm. Rather than using phase difference measurements that are modeled by a mixture of complex-valued Gaussians, as proposed in the original IDEM framework, we propose to use time difference of arrival measurements in multiple subbands and model them by a mixture of real-valued truncated Gaussians. Moreover, we propose to first filter the measurements in order to reduce the effect of the multipath conditions. The proposed method is evaluated using both simulated data and real-life recordings.
Pages: 682-695
Page count: 14
Related papers
84 in total
[1]   IMAGE METHOD FOR EFFICIENTLY SIMULATING SMALL-ROOM ACOUSTICS [J].
ALLEN, JB ;
BERKLEY, DA .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65 (04) :943-950
[2]   Low complexity multiple acoustic source localization in sensor networks based on energy measurements [J].
Ampeliotis, Dimitris ;
Berberidis, Kostas .
SIGNAL PROCESSING, 2010, 90 (04) :1300-1312
[3]  
Antonacci F, 2005, INT CONF ACOUST SPEE, P1061
[4]  
Bertrand A., 2011, P IEEE S COMM VEH TE, P1, DOI 10.1109/SCVT.2011.6101302
[5]  
Bishop C.M., 2006, PATTERN RECOGN, V4, P738, DOI 10.1117/1.2819119
[6]   Energy-based sensor network source localization via projection onto convex sets [J].
Blatt, Doron ;
Hero, Alfred O., III .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2006, 54 (09) :3614-3619
[7]  
Brandstein M., 2001, MICROPHONE ARRAYS SI, V8
[8]   Localization of multiple speakers based on a two step acoustic map analysis [J].
Brutti, Alessio ;
Omologo, Maurizio ;
Svaizer, Piergiorgio .
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, :4349-4352
[9]   Modeling the cochlear nucleus: A site for monaural echo suppression? [J].
Buerck, Moritz ;
van Hemmen, J. Leo .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2007, 122 (04) :2226-2235
[10]   Speech enhancement using a mixture-maximum model [J].
Burshtein, D ;
Gannot, S .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (06) :341-351