An audio-visual particle filter for speaker tracking on the CLEAR'06 evaluation dataset

Cited by: 0
Authors
Nickel, Kai [1]
Gehrig, Tobias [1]
Ekenel, Hazim K. [1]
McDonough, John [1]
Stiefelhagen, Rainer [1]
Affiliation
[1] Univ Karlsruhe, Interact Sys Labs, D-76131 Karlsruhe, Germany
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present an approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a joint particle filter framework. The filter performs sampled projections of 3D location hypotheses and scores them using features from both audio and video. On the video side, the features are based on foreground segmentation, multi-view face detection, and upper body detection. On the audio side, the time delays of arrival between pairs of microphones are estimated with a generalized cross-correlation function. In the CLEAR'06 evaluation, the system yielded a tracking accuracy (MOTA) of 71% for video-only, 55% for audio-only, and 90% for combined audio-visual tracking.
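The joint scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian score functions, the product fusion of audio and video scores, the single 3D "detection" standing in for the video features, the microphone layout, and all parameter values are assumptions chosen for illustration. The only fixed relation is geometric: a source at position x produces a time delay of arrival of (|x - m_a| - |x - m_b|) / c between microphones m_a and m_b.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def expected_tdoa(pos, mic_a, mic_b):
    """Time delay of arrival (s) that a source at pos would produce for a mic pair."""
    return (math.dist(pos, mic_a) - math.dist(pos, mic_b)) / SPEED_OF_SOUND


def audio_score(pos, mic_pairs, observed_tdoas, sigma=2e-4):
    """Hypothetical Gaussian score: how well pos explains the observed delays."""
    score = 1.0
    for (mic_a, mic_b), tau in zip(mic_pairs, observed_tdoas):
        err = expected_tdoa(pos, mic_a, mic_b) - tau
        score *= math.exp(-0.5 * (err / sigma) ** 2)
    return score


def video_score(pos, detection, sigma=0.3):
    """Hypothetical stand-in for video features: Gaussian around one 3D detection."""
    return math.exp(-0.5 * (math.dist(pos, detection) / sigma) ** 2)


def filter_step(particles, mic_pairs, observed_tdoas, detection, rng,
                motion_sigma=0.05):
    """One propagate / score / resample cycle over 3D location hypotheses."""
    # 1. Propagate each hypothesis with a random-walk motion model (assumed).
    moved = [tuple(c + rng.gauss(0.0, motion_sigma) for c in p) for p in particles]
    # 2. Score with both modalities; the tiny floor keeps the weight sum positive.
    weights = [
        audio_score(p, mic_pairs, observed_tdoas) * video_score(p, detection) + 1e-300
        for p in moved
    ]
    # 3. Resample in proportion to the weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    # Point estimate: mean of the resampled particle cloud.
    n = len(resampled)
    estimate = tuple(sum(p[i] for p in resampled) / n for i in range(3))
    return resampled, estimate


# Synthetic usage example: a 5 m x 6 m room with three made-up microphone pairs.
rng = random.Random(0)
true_pos = (2.0, 3.0, 1.7)
mic_pairs = [
    ((0.0, 0.0, 2.0), (0.4, 0.0, 2.0)),
    ((5.0, 0.0, 2.0), (5.0, 0.4, 2.0)),
    ((0.0, 6.0, 2.0), (0.4, 6.0, 2.0)),
]
tdoas = [expected_tdoa(true_pos, a, b) for a, b in mic_pairs]
particles = [
    (rng.uniform(0.0, 5.0), rng.uniform(0.0, 6.0), rng.uniform(1.0, 2.0))
    for _ in range(2000)
]
for _ in range(10):
    particles, estimate = filter_step(particles, mic_pairs, tdoas, true_pos, rng)
print("estimated position:", estimate)
```

In this sketch the observations are noise-free and the video detection sits exactly on the true position, so it only demonstrates the mechanics of the filter. A real system would estimate the delays from the signals (for example with a generalized cross-correlation function, as the abstract states) and score hypotheses against image features in each camera view.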
Pages: 69-80
Number of pages: 12
Related papers
  • [1] 3D Audio-Visual Speaker Tracking with A Novel Particle Filter
    Liu, Hong
    Sun, Yongheng
    Li, Yidi
    Yang, Bing
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7343 - 7348
  • [2] 3D AUDIO-VISUAL SPEAKER TRACKING WITH AN ADAPTIVE PARTICLE FILTER
    Qian, Xinyuan
    Brutti, Alessio
    Omologo, Maurizio
    Cavallaro, Andrea
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2896 - 2900
  • [3] Audio-visual speaker tracking with importance particle filters
    Gatica-Perez, D
    Lathoud, G
    McCowan, I
    Odobez, JM
    Moore, D
    2003 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL 3, PROCEEDINGS, 2003, : 25 - 28
  • [4] Particle Flow SMC-PHD Filter for Audio-Visual Multi-speaker Tracking
    Liu, Yang
    Wang, Wenwu
    Chambers, Jonathon
    Kilic, Volkan
    Hilton, Adrian
    LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION (LVA/ICA 2017), 2017, 10169 : 344 - 353
  • [5] 3D AUDIO-VISUAL SPEAKER TRACKING WITH A TWO-LAYER PARTICLE FILTER
    Liu, Hong
    Li, Yidi
    Yang, Bing
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 1955 - 1959
  • [6] AVA ACTIVE SPEAKER: AN AUDIO-VISUAL DATASET FOR ACTIVE SPEAKER DETECTION
    Roth, Joseph
    Chaudhuri, Sourish
    Klejch, Ondrej
    Marvin, Radhika
    Gallagher, Andrew
    Kaver, Liat
    Ramaswamy, Sharadh
    Stopczynski, Arkadiusz
    Schmid, Cordelia
    Xi, Zhonghua
    Pantofaru, Caroline
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4492 - 4496
  • [7] Particle Filtering for Bearing-Only Audio-Visual Speaker Detection and Tracking
    Rae, Andrew
    Khamis, Alaa
    Basir, Otman
    Kamel, Mohamed
    2009 3RD INTERNATIONAL CONFERENCE ON SIGNALS, CIRCUITS AND SYSTEMS (SCS 2009), 2009, : 161 - +
  • [8] Multilingual Audio-Visual Smartphone Dataset and Evaluation
    Mandalapu, Hareesh
    Reddy, P. N. Aravinda
    Ramachandra, Raghavendra
    Rao, Krothapalli Sreenivasa
    Mitra, Pabitra
    Prasanna, S. R. Mahadeva
    Busch, Christoph
    IEEE ACCESS, 2021, 9 : 153240 - 153257
  • [9] Speaker Tracking Based on Audio-Visual Fusion with Unknown Noise
    Cao, Jie
    Li, Jun
    Li, Wei
    PROCEEDINGS OF 2013 CHINESE INTELLIGENT AUTOMATION CONFERENCE: INTELLIGENT INFORMATION PROCESSING, 2013, 256 : 215 - 226
  • [10] Audio-visual active speaker tracking in cluttered indoors environments
    Talantzis, Fotios
    Pnevmatikakis, Aristodemos
    Constantinides, Anthony G.
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (03): 799 - 807