End-to-end learning interpolation for object tracking in low frame-rate video

Cited by: 7
Authors
Liu, Liqiang [1 ,2 ]
Cao, Jianzhong [1 ]
Affiliations
[1] Chinese Acad Sci, Xian Inst Opt & Precis Mech, 17 Xinxi Rd, Xian, Peoples R China
[2] Univ Chinese Acad Sci, 19 Yuquan Rd, Beijing, Peoples R China
Keywords
video signal processing; learning (artificial intelligence); object tracking; interpolation; mobile computing; low frame rates; implicit video frame interpolation sub-network; low frame-rate video; high frame-rate latent video; effective end-to-end optimisation; frame rate; tracking accuracy; semantic video analytics; end-to-end learning interpolation; subsequent semantic analytics; bandwidth constraints; analytics performance; motion estimation; Siamese networks
DOI
10.1049/iet-ipr.2019.0944
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many scenarios where videos are transmitted through bandwidth-limited channels for subsequent semantic analytics, the choice of frame rate must balance bandwidth constraints against analytics performance. Faced with this practical challenge, this study focuses on enhancing object tracking at low frame rates and proposes a learning-interpolation-for-tracking framework. The framework embeds an implicit video frame interpolation sub-network, which is concatenated and jointly trained with an object tracking sub-network. A low frame-rate input video is first mapped into a high frame-rate latent video, on which the tracker is learned. Novel strategies and loss functions are derived to ensure effective end-to-end optimisation of the authors' network. On several challenging benchmarks and settings, their method achieves a highly competitive trade-off between frame rate and tracking accuracy. As far as is known, the implications of interpolation for semantic video analytics and tracking remain unexplored, and the authors expect their method to find many applications in mobile embedded vision, the Internet of Things and edge computing.
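The data flow described in the abstract (low frame-rate input, then a denser latent sequence, then a tracker) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' network: frames are toy 1-D intensity rows, `interpolate_latent` uses plain linear blending as a stand-in for the learned implicit interpolation sub-network, `peak_tracker` is a trivial placeholder for the tracking sub-network, and the joint training and loss functions mentioned in the abstract are not modelled at all.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # toy 1-D "frame": one row of pixel intensities


def interpolate_latent(frames: Sequence[Frame], factor: int) -> List[Frame]:
    """Hypothetical stand-in for the interpolation sub-network.

    Inserts factor - 1 linearly blended frames between each consecutive
    pair, so N input frames become (N - 1) * factor + 1 frames.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not frames:
        return []
    dense: List[Frame] = []
    for prev, nxt in zip(frames, frames[1:]):
        for step in range(factor):
            w = step / factor  # w == 0 reproduces the original frame
            dense.append([(1 - w) * a + w * b for a, b in zip(prev, nxt)])
    dense.append(list(frames[-1]))
    return dense


def peak_tracker(frames: Sequence[Frame]) -> List[int]:
    """Hypothetical placeholder tracker: brightest pixel index per frame."""
    return [max(range(len(f)), key=f.__getitem__) for f in frames]


def track_low_frame_rate(
    frames: Sequence[Frame],
    factor: int,
    tracker: Callable[[Sequence[Frame]], List[int]] = peak_tracker,
) -> List[int]:
    """Run the tracker on the dense sequence, then keep only the
    predictions that fall on the original (low frame-rate) timestamps."""
    dense = interpolate_latent(frames, factor)
    return tracker(dense)[::factor]


if __name__ == "__main__":
    low = [
        [9.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 9.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 9.0],
    ]
    print(len(interpolate_latent(low, 4)))
    print(track_low_frame_rate(low, 4))
```

Linear blending only cross-fades between frames and does not move the object, which is one reason a learned interpolation trained together with the tracker is a different and stronger design than this placeholder.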
Pages: 1066-1072
Number of pages: 7