Co-Inference Discriminative Tracking Through Multi-Task Siamese Network

Cited by: 0
Authors
Chen, Yan [1 ,2 ]
Du, Jixiang [1 ,2 ]
Zhong, Bineng [1 ,2 ]
Affiliations
[1] Huaqiao Univ, Sch Comp Sci & Technol, Xiamen 361021, Peoples R China
[2] Huaqiao Univ, Fujian Key Lab Big Data Intelligence & Secur, Xiamen 361021, Peoples R China
Keywords
Target tracking; Training; Adaptation models; Detectors; Visualization; Task analysis; Training data; Visual tracking; multi-task learning; Siamese network; residual learning; co-inference; object tracking; multiple
DOI
10.1109/ACCESS.2020.3045036
Chinese Library Classification
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
In essence, visual tracking is a matching problem without any prior information about a class-agnostic object. By leveraging large-scale offline training data, recent trackers based on Siamese networks typically pre-learn an underlying similarity function before the tracking task even begins; consequently, they lack discriminative and adaptive power. To address these issues, we propose a multi-stage co-inference tracker (named MSCI) built on a multi-task Siamese network, in which a complicated tracking task is divided into three complementary sub-tasks (i.e., classification, regression and detection). First, we design a novel multi-task loss function to train the multi-task Siamese network end-to-end by jointly learning the three sub-tasks. The network contains three parallel yet collaborative output layers, corresponding to the three key components of our tracker (i.e., a classifier, a regressor and a residual-learning-based detector). By sharing representations among the components, we not only improve each component's generalization performance but also enhance the tracker's discriminative power. Then, we design a co-inference approach to effectively fuse the complementary components. As a result, our tracker avoids the pitfalls of any single component and obtains reliable observations that improve its adaptive power. Comprehensive experiments on OTB2013, OTB2015 and VOT2016 validate the effectiveness and robustness of our MSCI tracker.
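The multi-task structure described in the abstract — a shared Siamese embedding feeding three parallel output heads trained with one joint loss — can be sketched in a few lines. This is a minimal illustrative toy, not the paper's implementation: the layer sizes, head definitions, dummy targets and loss weights are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(x, W):
    """Shared embedding applied to both template and search patches
    (a single ReLU layer stands in for the Siamese backbone)."""
    return np.maximum(x @ W, 0.0)

# Toy dimensions (illustrative only).
d_in, d_feat = 8, 16
W_shared = rng.standard_normal((d_in, d_feat)) * 0.1

template = rng.standard_normal((1, d_in))
search = rng.standard_normal((1, d_in))

# Both branches share the same weights, then their features are
# combined into one representation used by all three heads.
z = embed(template, W_shared)
x = embed(search, W_shared)
joint = np.concatenate([z, x], axis=1)

# Three parallel heads on the shared representation, mirroring the
# classifier / regressor / detector components named in the abstract.
W_cls = rng.standard_normal((2 * d_feat, 2)) * 0.1   # fg/bg logits
W_reg = rng.standard_normal((2 * d_feat, 4)) * 0.1   # box offsets
W_det = rng.standard_normal((2 * d_feat, 1)) * 0.1   # detector response

cls_logits = joint @ W_cls
box = joint @ W_reg
det = joint @ W_det

def softmax_xent(logits, label):
    """Cross-entropy of one sample against an integer label."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -np.log(p[0, label])

# Joint multi-task loss: a weighted sum of the three sub-task losses.
# The targets (label 1, zero offsets, response 1.0) and the weight 0.5
# are arbitrary placeholders for the sketch.
loss = (softmax_xent(cls_logits, 1)
        + np.abs(box - np.zeros(4)).mean()     # L1 regression loss
        + 0.5 * (det - 1.0) ** 2).item()       # detector response loss
```

Because gradients of `loss` flow through `W_shared` from all three heads, each head regularizes the shared representation — the mechanism by which multi-task training improves each component's generalization in the paper's design.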
Pages: 60577-60587
Page count: 11