Forward-backward recursive expectation-maximization for concurrent speaker tracking

Cited by: 1
Authors
Dorfan, Yuval [1]
Schwartz, Boaz [1]
Gannot, Sharon [1]
Affiliations
[1] Bar Ilan Univ, Fac Engn, IL-5290002 Ramat Gan, Israel
Keywords
Sound source tracking; Recursive expectation-maximization; Microphone arrays; Simultaneous speakers; W-disjoint orthogonality; Forward-backward; MAXIMUM-LIKELIHOOD; ALGORITHM; EM; LOCALIZATION; CONVERGENCE; LOCATION
DOI
10.1186/s13636-020-00189-x
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper addresses the task of tracking multiple concurrent speakers in reverberant conditions. Since both past and future observations can contribute to the current location estimate, we propose a forward-backward approach, which improves tracking accuracy by introducing near-future data to the estimator, at the cost of a short additional latency. Unlike classical target tracking, we apply a non-Bayesian approach, which makes no assumptions about the target trajectories beyond assuming a realistic change in the parameters due to natural behaviour. The proposed method is based on the recursive expectation-maximization (REM) approach and is dubbed forward-backward recursive expectation-maximization (FB-REM). Its performance is demonstrated in an experimental study in which the tested scenarios involve both simulated and recorded signals, with typical reverberation levels and multiple moving sources. The proposed algorithm is shown to outperform the conventional causal REM.
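The trade-off between accuracy and latency described in the abstract can be illustrated with a toy example. The sketch below is not the FB-REM algorithm: it replaces the EM-based estimator with a plain scalar recursive smoother, and the names `gamma` (step size), `lookahead` (number of future samples) and the equal-weight averaging of the two passes are all assumptions made for illustration. It only shows why adding a short backward pass over near-future observations can reduce the lag of a purely causal recursive estimate for a moving target.

```python
def forward_recursive(obs, gamma):
    """Causal recursive estimate: each step moves toward the new observation."""
    est = []
    prev = obs[0]
    for z in obs:
        prev = prev + gamma * (z - prev)
        est.append(prev)
    return est


def forward_backward(obs, gamma, lookahead):
    """Average the causal estimate with a backward recursion over a short
    window of future observations (hypothetical combination rule)."""
    fwd = forward_recursive(obs, gamma)
    out = []
    for t in range(len(obs)):
        # The window is truncated near the end of the sequence.
        end = min(t + lookahead, len(obs) - 1)
        bwd = obs[end]
        # Run the same recursion backward in time, from `end` down to t.
        for k in range(end - 1, t - 1, -1):
            bwd = bwd + gamma * (obs[k] - bwd)
        out.append(0.5 * (fwd[t] + bwd))
    return out


def mean_abs_error(est, truth):
    return sum(abs(e - x) for e, x in zip(est, truth)) / len(truth)


# Noise-free source moving at constant speed (1-D position per frame).
truth = [0.1 * t for t in range(50)]
causal = forward_recursive(truth, gamma=0.3)
smoothed = forward_backward(truth, gamma=0.3, lookahead=5)
print("causal error:          ", mean_abs_error(causal, truth))
print("forward-backward error:", mean_abs_error(smoothed, truth))
```

In this toy setting the causal estimate trails the moving target while the backward estimate leads it, so their average is closer to the true position. The price is that the estimate for frame `t` is only available after frame `t + lookahead` has been observed.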
Pages: 13