Deep Learning in Visual Tracking: A Review

Cited by: 59
Authors
Jiao, Licheng [1, 2]
Wang, Dan [1, 2]
Bai, Yidong [3, 4]
Chen, Puhua [1, 2]
Liu, Fang [1, 2]
Affiliations
[1] Xidian Univ, Int Res Ctr Intelligent Percept & Computat, Key Lab Intelligent Percept & Image Understanding, Minist Educ, Xian 710071, Peoples R China
[2] Xidian Univ, Sch Artificial Intelligence, Joint Int Res Lab Intelligent Percept & Computat, Xian 710071, Peoples R China
[3] Xidian Univ, Sch Artificial Intelligence, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Peoples R China
[4] Waseda Univ, Intelligent Software Lab, Tokyo 1698555, Japan
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Visualization; Target tracking; Task analysis; Feature extraction; Deep learning; Trajectory; Nonhomogeneous media; Deep learning (DL); multiple-object tracking (MOT); single-object tracking (SOT); MULTIPLE OBJECT TRACKING; CORRELATION FILTERS; NEURAL-NETWORKS; ROBUST; MULTITARGET; SYSTEM;
DOI
10.1109/TNNLS.2021.3136907
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning (DL) has made breakthroughs in many computer vision tasks, including visual tracking. Beginning with research on automatically learning highly abstract feature representations, DL has since reached into every aspect of tracking, including similarity metrics, data association, and bounding box estimation. After sustained research by the community, purely DL-based trackers now achieve state-of-the-art performance. We believe that it is time to comprehensively review the development of DL research in visual tracking. In this article, we give an overview of the critical improvements brought to the field by DL: deep feature representations, network architecture, and four crucial issues in visual tracking (spatiotemporal information integration, target-specific classification, target information update, and bounding box estimation). For the first time, the scope of a survey of DL-based tracking covers both primary subtasks, single-object tracking and multiple-object tracking. We also analyze the performance of DL-based approaches and draw conclusions. Finally, we provide several promising directions and tasks in visual tracking and relevant fields.
Pages: 5497-5516
Number of pages: 20