Forward-backward recursive expectation-maximization for concurrent speaker tracking

Authors
Yuval Dorfan
Boaz Schwartz
Sharon Gannot
Affiliations
[1] Faculty of Engineering, Bar-Ilan University
Source
EURASIP Journal on Audio, Speech, and Music Processing, vol. 2021
Keywords
Sound source tracking; Recursive expectation-maximization; Microphone arrays; Simultaneous speakers; W-disjoint orthogonality; Forward-backward
Abstract
This paper addresses the task of tracking multiple concurrent speakers in reverberant conditions. Since both past and future observations can contribute to the current location estimate, we propose a forward-backward approach, which improves tracking accuracy by introducing near-future data to the estimator at the cost of a short additional latency. Unlike classical target tracking, we apply a non-Bayesian approach, which makes no assumptions about the target trajectories beyond assuming realistic parameter changes due to natural behaviour. The proposed method, dubbed forward-backward recursive expectation-maximization (FB-REM), is based on the recursive expectation-maximization (REM) approach. Its performance is demonstrated in an experimental study whose scenarios involve both simulated and recorded signals, with typical reverberation levels and multiple moving sources. The proposed algorithm is shown to outperform the conventional causal REM.