PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

Cited by: 1
Authors
Li, Bowen [1]
Huang, Ziyuan [2]
Ye, Junjie [3]
Li, Yiming [4]
Scherer, Sebastian [1]
Zhao, Hang [5]
Fu, Changhong [3]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA USA
[2] Natl Univ Singapore, Singapore, Singapore
[3] Tongji Univ, Shanghai, Peoples R China
[4] NYU, New York, NY USA
[5] Tsinghua Univ, Beijing, Peoples R China
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023
Keywords
DOI
10.1109/ICCV51070.2023.00918
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual object tracking is essential to intelligent robots. Most existing approaches have ignored the online latency that can cause severe performance degradation during real-world processing. Especially for unmanned aerial vehicles (UAVs), where robust tracking is more challenging and onboard computation is limited, the latency issue can be fatal. In this work, we present a simple framework for end-to-end latency-aware tracking, i.e., end-to-end predictive visual tracking (PVT++). Unlike existing solutions that naively append Kalman filters after trackers, PVT++ can be jointly optimized, so that it not only takes motion information but also leverages the rich visual knowledge in most pretrained tracker models for robust prediction. In addition, to bridge the training-evaluation domain gap, we propose a relative motion factor, empowering PVT++ to generalize to challenging and complex UAV tracking scenes. These careful designs have made the small-capacity, lightweight PVT++ a widely effective solution. This work also presents an extended latency-aware evaluation benchmark for assessing an any-speed tracker in the online setting. Empirical results on a robotic platform from the aerial perspective show that PVT++ can achieve significant performance gains on various trackers and exhibit higher accuracy than prior solutions, largely mitigating the degradation brought by latency. Our code is public at https://github.com/Jaraxxus-Me/PVT_pp.git.
Pages: 9972-9982
Page count: 11