3D AUDIO-VISUAL SPEAKER TRACKING WITH A TWO-LAYER PARTICLE FILTER

被引:0
|
作者
Liu, Hong [1 ]
Li, Yidi [1 ]
Yang, Bing [1 ]
机构
[1] Peking Univ, Shenzhen Grad Sch, Key Lab Machine Percept, Beijing, Peoples R China
来源
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019年
基金
中国国家自然科学基金;
关键词
3D speaker tracking; audio-visual fusion; particle filter; adaptive likelihood; LOCALIZATION;
D O I
10.1109/icip.2019.8803117
中图分类号
TB8 [摄影技术];
学科分类号
0804 ;
摘要
Audio-visual speaker tracking in 3D space is a challenging problem. Although the classical particle filter based methods have shown effectiveness in audio-visual speaker tracking, the performance degrades considerably when the measurements are disturbed by noise. To this end, a novel two-layer particle filter is proposed for 3D audio-visual speaker tracking. Firstly, two groups of particles, which are generated from the audio and video streams respectively, are propagated independently in the audio layer and visual layer. Then, the audio and visual likelihoods are combined in an adaptive sigmoid function, which can adjust particle weights according to the confidence of two modalities. Finally, an optimal particle set selected from two groups of particles is proposed to determine the speaker position and reset the particle positions in the next frame. Experiments on AV16.3 database show that our method outperforms the trackers using individual modalities and the existing approaches in the 3D space and on the image plane.
引用
收藏
页码:1955 / 1959
页数:5
相关论文
共 50 条
  • [1] Deep Metric Learning-Assisted 3D Audio-Visual Speaker Tracking via Two-Layer Particle Filter
    Li, Yidi
    Liu, Hong
    Yang, Bing
    Ding, Runwei
    Chen, Yang
    COMPLEXITY, 2020, 2020
  • [2] 3D Audio-Visual Speaker Tracking with A Novel Particle Filter
    Liu, Hong
    Sun, Yongheng
    Li, Yidi
    Yang, Bing
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7343 - 7348
  • [3] 3D AUDIO-VISUAL SPEAKER TRACKING WITH AN ADAPTIVE PARTICLE FILTER
    Qian, Xinyuan
    Brutti, Alessio
    Omologo, Maurizio
    Cavallaro, Andrea
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2896 - 2900
  • [4] An audio-visual particle filter for speaker tracking on the CLEAR'06 evaluation dataset
    Nickel, Kai
    Gehrig, Tobias
    Ekenel, Hazim K.
    McDonough, John
    Stiefelhagen, Rainer
    MULTIMODAL TECHNOLOGIES FOR PERCEPTION OF HUMANS, 2007, 4122 : 69 - 80
  • [5] Audio-visual speaker tracking with importance particle filters
    Gatica-Perez, D
    Lathoud, G
    McCowan, I
    Odobez, JM
    Moore, D
    2003 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL 3, PROCEEDINGS, 2003, : 25 - 28
  • [6] Audio-Visual Clustering for 3D Speaker Localization
    Khalidov, Vasil
    Forbes, Florence
    Hansard, Miles
    Arnaud, Elise
    Horaud, Radu
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, PROCEEDINGS, 2008, 5237 : 86 - 97
  • [7] Particle Flow SMC-PHD Filter for Audio-Visual Multi-speaker Tracking
    Liu, Yang
    Wang, Wenwu
    Chambers, Jonathon
    Kilic, Volkan
    Hilton, Adrian
    LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION (LVA/ICA 2017), 2017, 10169 : 344 - 353
  • [8] Particle Filtering for Bearing-Only Audio-Visual Speaker Detection and Tracking
    Rae, Andrew
    Khamis, Alaa
    Basir, Otman
    Kamel, Mohamed
    2009 3RD INTERNATIONAL CONFERENCE ON SIGNALS, CIRCUITS AND SYSTEMS (SCS 2009), 2009, : 161 - +
  • [9] Speaker Tracking Based on Audio-Visual Fusion with Unknown Noise
    Cao, Jie
    Li, Jun
    Li, Wei
    PROCEEDINGS OF 2013 CHINESE INTELLIGENT AUTOMATION CONFERENCE: INTELLIGENT INFORMATION PROCESSING, 2013, 256 : 215 - 226
  • [10] Audio-visual active speaker tracking in cluttered indoors environments
    Talantzis, Fotios
    Pnevmatikakis, Aristodemos
    Constantinides, Anthony G.
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (03): : 799 - 807