Leveraging temporal-aware fine-grained features for robust multiple object tracking

Cited by: 6
Authors
Wu, Han [1]
Nie, Jiahao [1]
Zhu, Ziming [1]
He, Zhiwei [1,2]
Gao, Mingyu [1,2]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Elect Informat, Hangzhou 310018, Zhejiang, Peoples R China
[2] Zhejiang Prov Key Lab Equipment Elect, 2019E10009, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multiple object tracking; Tracking-by-detection; Critical feature capturing; Temporal-aware feature aggregation;
DOI
10.1007/s11227-022-04776-x
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Existing multi-object trackers mainly apply the tracking-by-detection (TBD) paradigm and have achieved remarkable success. However, the mainstream methods execute their detection networks alone, without taking full advantage of the information derived from tracking so that the detection and tracking processes can benefit from each other. In this paper, we achieve strengthened tracking performance in complex scenarios by utilizing the rich temporal information derived from the tracking process to enhance the critical features at the current moment. Specifically, we first propose a critical feature capturing network (CFCN) for extracting receptive field adaptive discriminative features for each frame. Then, we design a temporal-aware feature aggregation module (TFAM), which is used to propagate previous critical features, thus leveraging temporal information to alleviate the detection quality degradation encountered when the visual cues decrease. Extensive experimental comparisons and analyses demonstrate the superiority and effectiveness of the proposed method on the popular and challenging MOT16, MOT17, and MOT20 benchmarks. The experimental results reveal that our tracker achieves state-of-the-art tracking performance, e.g., IDF1 of 75.2% on IDF and MOTA of 80.4% on MOT17.
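The abstract reports results as MOTA and IDF1 scores. As a reading aid, the minimal sketch below computes both from their standard definitions (CLEAR MOT for MOTA, identity-based matching for IDF1). The function names and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the paper or from any benchmark submission.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over every frame."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN), using identity-level matches."""
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        raise ValueError("at least one count must be non-zero")
    return 2 * idtp / denominator


# Hypothetical counts, for illustration only (not the paper's data).
example_mota = mota(false_negatives=150, false_positives=40, id_switches=10, num_gt=1000)
example_idf1 = idf1(idtp=750, idfp=250, idfn=250)
print(f"MOTA = {example_mota:.3f}, IDF1 = {example_idf1:.3f}")
```

MOTA penalizes detection errors and identity switches together, so it mostly tracks detection quality, whereas IDF1 measures how consistently each target keeps its identity over time. That is why trackers are usually reported on both.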
Pages: 2910-2931
Number of pages: 22