Deep Metric Learning-Assisted 3D Audio-Visual Speaker Tracking via Two-Layer Particle Filter

Cited by: 1
Authors
Li, Yidi [1]
Liu, Hong [1]
Yang, Bing [1]
Ding, Runwei [2]
Chen, Yang [3]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Key Lab Machine Percept, Shenzhen 518055, Peoples R China
[2] Chongqing Univ Technol, Sch Artificial Intelligence, Chongqing 401135, Peoples R China
[3] Yanka Kupala State Univ Grodno, Grodno, Belarus
Funding
National Natural Science Foundation of China;
Keywords
LOCALIZATION;
DOI
10.1155/2020/3764309
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
Integrating multimodal information from audio and video is an effective and promising approach to speaker tracking. The main open challenge is constructing a stable observation model. To this end, we propose a 3D audio-visual speaker tracker that uses deep metric learning within a two-layer particle filter framework. First, an audio-guided motion model generates candidate samples in a hierarchical structure consisting of an audio layer and a visual layer. Then, a stable observation model is built on a purpose-designed Siamese network, which provides a similarity-based likelihood for calculating particle weights. The speaker position is estimated from an optimal particle set that integrates the decisions of the audio particles and the visual particles. Finally, a template update strategy based on a long short-term mechanism is adopted to prevent drift during tracking. Experimental results demonstrate that the proposed method outperforms single-modal trackers and the comparison methods. Efficient and robust tracking is achieved both in 3D space and on the image plane.
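The abstract's idea of a similarity-based likelihood that sets particle weights can be illustrated with a minimal, generic particle-filter sketch. This is not the authors' two-layer tracker or their Siamese network: the softmax weighting, the `temperature` parameter, the room dimensions, and the distance-based stand-in for the similarity score are all assumptions made for illustration.

```python
import math
import random


def weights_from_similarity(similarities, temperature=0.1):
    """Map similarity scores to normalized weights (softmax form, an assumption)."""
    top = max(similarities)  # subtract the maximum for numerical stability
    raw = [math.exp((s - top) / temperature) for s in similarities]
    total = sum(raw)
    return [r / total for r in raw]


def estimate_position(particles, weights):
    """Weighted mean of the 3D particle positions."""
    return tuple(
        sum(w * p[i] for p, w in zip(particles, weights)) for i in range(3)
    )


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to their weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


rng = random.Random(0)
# Hypothetical 4 m x 4 m x 2 m room with uniformly scattered candidate positions.
particles = [
    (rng.uniform(0, 4), rng.uniform(0, 4), rng.uniform(0, 2)) for _ in range(200)
]
target = (2.0, 1.0, 1.5)  # hypothetical speaker position

# Stand-in for a learned similarity score: higher when a particle is closer.
sims = [-math.dist(p, target) for p in particles]

w = weights_from_similarity(sims)
estimate = estimate_position(particles, w)
resampled = resample(particles, w, rng)
print(estimate)
```

In a tracker of the kind the abstract describes, the stand-in score would be replaced by the output of the learned metric, and separate particle sets would be maintained for the audio and visual layers.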
Pages: 8