MLVSNet: Multi-level Voting Siamese Network for 3D Visual Tracking

Cited by: 33
Authors
Wang, Zhoutao [1]
Xie, Qian [1]
Lai, Yu-Kun [2]
Wu, Jing [2]
Long, Kun [1]
Wang, Jun [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Nanjing, Peoples R China
[2] Cardiff Univ, Cardiff, S Glam, Wales
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/ICCV48922.2021.00309
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Benefiting from the excellent performance of Siamese-based trackers, huge progress on 2D visual tracking has been achieved. However, 3D visual tracking is still under-explored. Inspired by the idea of Hough voting in 3D object detection, in this paper we propose a Multi-level Voting Siamese Network (MLVSNet) for 3D visual tracking from outdoor point cloud sequences. To deal with sparsity in outdoor 3D point clouds, we propose to perform Hough voting on multi-level features to get more vote centers and retain more useful information, instead of voting only on the final-level features as in previous methods. We also design an efficient and lightweight Target-Guided Attention (TGA) module to transfer the target information and highlight the target points in the search area. Moreover, we propose a Vote-cluster Feature Enhancement (VFE) module to exploit the relationships between different vote clusters. Extensive experiments on the KITTI 3D tracking benchmark demonstrate that our MLVSNet outperforms state-of-the-art methods by significant margins. Code will be available at https://github.com/CodeWZT/MLVSNet.
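The multi-level voting idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `downsample`, `cast_votes`, `multi_level_votes`, and the offset predictor are hypothetical names introduced here, the strided downsampling is only a stand-in for a learned feature hierarchy, and the offsets come from a caller-supplied function rather than a trained network. It shows only why voting at every level yields more vote centers than voting at the final level alone.

```python
def downsample(points, keep):
    """Toy stand-in for one feature level: keep about `keep` evenly strided points."""
    step = max(1, len(points) // keep)
    return points[::step][:keep]


def cast_votes(seeds, predict_offset):
    """Hough voting: each seed proposes a center as seed + predicted offset."""
    votes = []
    for seed in seeds:
        offset = predict_offset(seed)
        votes.append(tuple(s + o for s, o in zip(seed, offset)))
    return votes


def multi_level_votes(points, level_sizes, predict_offset):
    """Collect votes from the seeds of every level, not only the last one."""
    votes = []
    seeds = points
    for size in level_sizes:
        seeds = downsample(seeds, size)  # progressively sparser seed set
        votes.extend(cast_votes(seeds, predict_offset))
    return votes


if __name__ == "__main__":
    # Hypothetical data: 64 points on a line and an oracle offset predictor
    # that always points at a known center (a trained network would only
    # approximate this).
    points = [(float(i), 0.0, 0.0) for i in range(64)]
    center = (1.0, 2.0, 3.0)

    def oracle(p):
        return tuple(c - x for c, x in zip(center, p))

    all_levels = multi_level_votes(points, (32, 16, 8), oracle)
    final_only = multi_level_votes(points, (8,), oracle)
    print(len(all_levels), len(final_only))
```

With three levels of 32, 16, and 8 seeds, the sketch collects 56 votes, compared with 8 when only the sparsest level votes, which is the effect the abstract describes as retaining more vote centers on sparse outdoor point clouds.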
Pages: 3081-3090
Page count: 10