MLVSNet: Multi-level Voting Siamese Network for 3D Visual Tracking

Cited by: 33

Authors:

Wang, Zhoutao [1]

Xie, Qian [1]

Lai, Yu-Kun [2]

Wu, Jing [2]

Long, Kun [1]

Wang, Jun [1]

Affiliations:

[1] Nanjing University of Aeronautics and Astronautics, Nanjing, People's Republic of China

[2] Cardiff University, Cardiff, South Glamorgan, Wales

Source:

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1109/ICCV48922.2021.00309

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Thanks to the excellent performance of Siamese-based trackers, 2D visual tracking has made huge progress. However, 3D visual tracking is still under-explored. Inspired by the idea of Hough voting in 3D object detection, in this paper, we propose a Multi-level Voting Siamese Network (MLVSNet) for 3D visual tracking from outdoor point cloud sequences. To deal with sparsity in outdoor 3D point clouds, we propose to perform Hough voting on multi-level features to get more vote centers and retain more useful information, instead of voting only on the final-level feature as in previous methods. We also design an efficient and lightweight Target-Guided Attention (TGA) module to transfer the target information and highlight the target points in the search area. Moreover, we propose a Vote-cluster Feature Enhancement (VFE) module to exploit the relationships between different vote clusters. Extensive experiments on the 3D tracking benchmark of the KITTI dataset demonstrate that our MLVSNet outperforms state-of-the-art methods by significant margins. Code will be available at https://github.com/CodeWZT/MLVSNet.
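
Illustrative note: the abstract's central idea, casting Hough votes at several feature levels rather than only at the last one, can be sketched roughly as follows. This is a toy sketch under stated assumptions, not the authors' implementation: the function names `cast_votes` and `multi_level_votes` are hypothetical, the offsets are hand-set here whereas a real network would predict them from learned point features, and the TGA and VFE modules are not modelled.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def cast_votes(seeds: Sequence[Point], offsets: Sequence[Point]) -> List[Point]:
    """Each seed point votes for an object center: vote = seed + offset."""
    if len(seeds) != len(offsets):
        raise ValueError("one offset per seed point is required")
    return [
        (sx + ox, sy + oy, sz + oz)
        for (sx, sy, sz), (ox, oy, oz) in zip(seeds, offsets)
    ]


def multi_level_votes(
    levels: Sequence[Tuple[Sequence[Point], Sequence[Point]]],
) -> List[Point]:
    """Pool the votes from every level instead of keeping only the last level."""
    votes: List[Point] = []
    for seeds, offsets in levels:
        votes.extend(cast_votes(seeds, offsets))
    return votes


# Hypothetical toy data: three levels with progressively fewer points,
# mimicking down-sampling in a point-cloud backbone. Offsets are hand-set
# (an assumption for illustration) so that every vote lands on (1, 1, 0).
levels = [
    (
        [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)],
        [(1.0, 1.0, 0.0), (-1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, -1.0, 0.0)],
    ),
    (
        [(0.0, 0.0, 0.0), (2.0, 2.0, 0.0)],
        [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)],
    ),
    (
        [(2.0, 0.0, 0.0)],
        [(-1.0, 1.0, 0.0)],
    ),
]

all_votes = multi_level_votes(levels)        # votes pooled over all levels
last_level_votes = cast_votes(*levels[-1])   # votes from the final level only
```

In this toy setting the final level alone yields a single vote, while pooling across levels yields seven, which is the sense in which multi-level voting produces "more vote centers" for sparse point clouds.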


Pages: 3081-3090

Page count: 10

