End-to-end learning interpolation for object tracking in low frame-rate video

Cited by: 7
Authors
Liu, Liqiang [1, 2]
Cao, Jianzhong [1]
Affiliations
[1] Chinese Acad Sci, Xian Inst Opt & Precis Mech, 17 Xinxi Rd, Xian, Peoples R China
[2] Univ Chinese Acad Sci, 19 Yuquan Rd, Beijing, Peoples R China
Keywords
video signal processing; learning (artificial intelligence); object tracking; interpolation; mobile computing; low frame rates; implicit video frame interpolation sub-network; low frame-rate video; high frame-rate latent video; effective end-to-end optimisation; frame rate; tracking accuracy; semantic video analytics; end-to-end learning interpolation; subsequent semantic analytics; bandwidth constraints; analytics performance; motion estimation; Siamese networks
DOI
10.1049/iet-ipr.2019.0944
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In many scenarios where videos are transmitted through bandwidth-limited channels for subsequent semantic analytics, the choice of frame rate has to balance bandwidth constraints against analytics performance. Faced with this practical challenge, this study focuses on enhancing object tracking at low frame rates and proposes a learning interpolation for tracking framework. This framework embeds an implicit video frame interpolation sub-network, which is concatenated and jointly trained with an object tracking sub-network. A low frame-rate input video is first mapped into a high frame-rate latent video, on which the tracker is then learned. Novel strategies and loss functions are derived to ensure the effective end-to-end optimisation of the authors' network. On several challenging benchmarks and settings, their method achieves a highly competitive tradeoff between frame rate and tracking accuracy. As far as the authors are aware, the implications of interpolation for semantic video analytics and tracking remain unexplored, and they expect their method to find many applications in mobile embedded vision, the Internet of Things and edge computing.
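The data flow described in the abstract (low frame-rate input, high frame-rate latent sequence, tracker on top) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `upsample_latents` and `toy_tracker` are hypothetical, plain linear blending stands in for the learned interpolation sub-network, and an argmax stands in for the tracking sub-network. It does not show the joint training or the loss functions of the paper.

```python
from typing import List, Sequence

Vector = List[float]


def upsample_latents(latents: Sequence[Vector], factor: int) -> List[Vector]:
    """Raise the temporal rate of a latent sequence by `factor`.

    Between each pair of consecutive latents, factor - 1 blended latents
    are inserted. Linear blending is only a stand-in (assumption) for a
    learned, implicit interpolation sub-network.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not latents:
        return []
    dense: List[Vector] = []
    for prev, nxt in zip(latents, latents[1:]):
        for step in range(factor):
            t = step / factor  # t = 0 reproduces `prev` itself
            dense.append([(1.0 - t) * a + t * b for a, b in zip(prev, nxt)])
    dense.append(list(latents[-1]))  # keep the final observed latent
    return dense


def toy_tracker(latents: Sequence[Vector]) -> List[int]:
    """Hypothetical tracker head: one position estimate per latent step,
    taken as the index of the strongest activation."""
    return [max(range(len(v)), key=v.__getitem__) for v in latents]


# Two observed low frame-rate latents, upsampled 4x before tracking.
low_rate = [[1.0, 0.0], [0.0, 1.0]]
high_rate = upsample_latents(low_rate, factor=4)
positions = toy_tracker(high_rate)
```

With `n` observed frames and an upsampling factor `k`, the tracker in this sketch receives `(n - 1) * k + 1` latent steps instead of `n`, which is the sense in which it operates on a higher frame-rate sequence than the one transmitted.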
Pages: 1066-1072
Number of pages: 7
Related papers
50 in total
  • [1] End-to-end frame-rate adaptive streaming of video data
    Fung, CW
    Liew, SC
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, PROCEEDINGS VOL 2, 1999, : 67 - 71
  • [2] End-to-end Active Object Tracking via Reinforcement Learning
    Luo, Wenhan
    Sun, Peng
    Zhong, Fangwei
    Liu, Wei
    Zhang, Tong
    Wang, Yizhou
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [3] End-to-End Learning for Video Frame Compression with Self-Attention
    Zou, Nannan
    Zhang, Honglei
    Cricri, Francesco
    Tavakoli, Hamed R.
    Lainema, Jani
    Aksu, Emre
    Hannuksela, Miska
    Rahtu, Esa
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 580 - 584
  • [4] FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation
    Voigtlaender, Paul
    Chai, Yuning
    Schroff, Florian
    Adam, Hartwig
    Leibe, Bastian
    Chen, Liang-Chieh
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9473 - 9482
  • [5] Object tracking in low-frame-rate video
    Porikli, F
    Tuzel, O
    IMAGE AND VIDEO COMMUNICATIONS AND PROCESSING 2005, PTS 1 AND 2, 2005, 5685 : 72 - 79
  • [6] End-to-end active object tracking football game via reinforcement learning
    Qin, Haobin
    Liu, Ming
    Dong, Liquan
    Kong, Lingqin
    Hui, Mei
    Zhao, Yuejin
    OPTICAL METROLOGY AND INSPECTION FOR INDUSTRIAL APPLICATIONS IX, 2022, 12319
  • [7] Toward End-to-End Object Detection and Tracking on the Edge
    Tabkhi, Hamed
    SEC 2017: 2017 THE SECOND ACM/IEEE SYMPOSIUM ON EDGE COMPUTING (SEC'17), 2017,
  • [8] CSVideoNet: A Real-time End-to-end Learning Framework for High-frame-rate Video Compressive Sensing
    Xu, Kai
    Ren, Fengbo
    2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018, : 1680 - 1688
  • [9] An End-to-End Learning Framework for Video Compression
    Lu, Guo
    Zhang, Xiaoyun
    Ouyang, Wanli
    Chen, Li
    Gao, Zhiyong
    Xu, Dong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (10) : 3292 - 3308
  • [10] End-to-end Learning Improves Static Object Geo-localization from Video
    Chaabane, Mohamed
    Gueguen, Lionel
    Trabelsi, Ameni
    Beveridge, Ross
    O'Hara, Stephen
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 2062 - 2071