PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

Cited by: 1

Authors:

Li, Bowen [1]

Huang, Ziyuan [2]

Ye, Junjie [3]

Li, Yiming [4]

Scherer, Sebastian [1]

Zhao, Hang [5]

Fu, Changhong [3]

Affiliations:

[1] Carnegie Mellon University, Pittsburgh, PA, USA

[2] National University of Singapore, Singapore

[3] Tongji University, Shanghai, People's Republic of China

[4] New York University, New York, NY, USA

[5] Tsinghua University, Beijing, People's Republic of China

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023

DOI:

10.1109/ICCV51070.2023.00918

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Visual object tracking is essential to intelligent robots. Most existing approaches ignore the online latency that can cause severe performance degradation during real-world processing. For unmanned aerial vehicles (UAVs) in particular, where robust tracking is more challenging and onboard computation is limited, the latency issue can be fatal. In this work, we present a simple framework for end-to-end latency-aware tracking, i.e., end-to-end predictive visual tracking (PVT++). Unlike existing solutions that naively append Kalman filters after trackers, PVT++ can be jointly optimized, so that it not only takes motion information into account but also leverages the rich visual knowledge in most pretrained tracker models for robust prediction. In addition, to bridge the training-evaluation domain gap, we propose a relative motion factor, which enables PVT++ to generalize to challenging and complex UAV tracking scenes. These careful designs make the small-capacity, lightweight PVT++ a widely effective solution. This work also presents an extended latency-aware evaluation benchmark for assessing any-speed trackers in the online setting. Empirical results on a robotic platform from the aerial perspective show that PVT++ achieves significant performance gains on various trackers and exhibits higher accuracy than prior solutions, largely mitigating the degradation brought by latency. Our code is public at https://github.com/Jaraxxus-Me/PVT_pp.git.
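
To make the latency problem in the abstract concrete, the following is a minimal sketch of motion-only latency compensation: the tracker's last two outputs are linearly extrapolated to the time at which the result will actually be used. This is not PVT++ and not a Kalman filter; it is a simpler constant-velocity stand-in for the kind of motion-only baseline the abstract contrasts against. The function name `extrapolate_box`, the `(cx, cy, w, h)` box layout, and the timestamp units are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (center_x, center_y, width, height).
Box = Tuple[float, float, float, float]


def extrapolate_box(history: List[Tuple[float, Box]], target_time: float) -> Box:
    """Extrapolate the latest box to target_time under a constant-velocity assumption.

    history holds (timestamp, box) pairs in chronological order; only the
    last two entries are used to estimate the per-coordinate rate of change.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if len(history) == 1:
        # No motion estimate is possible yet, so return the box unchanged.
        return history[-1][1]

    (t0, b0), (t1, b1) = history[-2], history[-1]
    dt = t1 - t0
    if dt <= 0:
        # Degenerate timestamps: fall back to the latest box.
        return b1

    # Fraction of the last inter-observation step to project forward.
    scale = (target_time - t1) / dt
    return tuple(c1 + (c1 - c0) * scale for c0, c1 in zip(b0, b1))


# Usage: the tracker finished frames 0 and 2, but its output is consumed at frame 3.
observations = [(0, (100.0, 100.0, 40.0, 40.0)), (2, (110.0, 100.0, 40.0, 40.0))]
predicted = extrapolate_box(observations, target_time=3)
```

A predictor of this kind sees only past box coordinates. According to the abstract, PVT++ differs in that the predictor is optimized jointly with the tracker and can also draw on the tracker's visual features.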

Pages: 9972-9982

Page count: 11
