Adversarial Feature Sampling Learning for Efficient Visual Tracking

Cited by: 9
Authors
Yin, Yingjie [1 ,2 ,3 ]
Xu, De [1 ,3 ]
Wang, Xingang [1 ,3 ]
Zhang, Lei [2 ]
Affiliations
[1] Chinese Acad Sci, Res Ctr Precis Sensing & Control, Inst Automat, Beijing 100190, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adversarial learning; deep convolution neural network; feature sampling; visual tracking; OBJECT TRACKING;
DOI
10.1109/TASE.2019.2948402
CLC Classification Code
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
The tracking-by-detection framework usually consists of two stages: drawing samples around the target object and classifying each sample as either the target object or background. Current popular trackers under this framework typically draw many samples from the raw image and feed them into deep neural networks, resulting in a high computational burden and low tracking speed. In this article, we propose an adversarial feature sampling learning (AFSL) method to address this problem. A convolutional neural network is designed that takes only one cropped image around the target object as input, and samples are collected from the feature maps with spatial bilinear resampling. To enrich the appearance variations of positive samples in the feature space, which has limited spatial resolution, we fuse high-level and low-level features to better describe the target by using a generative adversarial network. Extensive experiments on benchmark data sets demonstrate that the proposed AFSL achieves leading tracking accuracy while significantly accelerating the speed of tracking-by-detection trackers. Note to Practitioners-Visual tracking can be applied in many intelligent automation systems, such as robotic navigation and intelligent human-computer interaction systems. In a robotic navigation system, visual tracking can generate the target's motion trajectory from image sequences; in a human-computer interaction system, it can automatically obtain body-movement information during the interactive process. Accuracy and speed are two key indicators for visual tracking, and intelligent automation systems usually need a tracker with both higher accuracy and faster speed. This article aims to develop a fast and accurate tracking method by adversarial feature sampling learning (AFSL).
In the concrete implementation, AFSL draws samples in the feature space rather than from raw images, which reduces computation. An adversarial learning mechanism is then adopted to boost the sampled features and enrich the target appearances in the feature space, improving tracking accuracy. The proposed tracker is shown to maintain leading tracking accuracy while significantly accelerating the tracking speed.
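The abstract's key efficiency idea is to crop the search region once, run it through the network once, and then extract each candidate's features by bilinearly resampling the shared feature map instead of re-running the network per sample. The sketch below illustrates that resampling step in plain NumPy; the function names, the 3x3 sampling grid, and the box format are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def bilinear_sample(feat, ys, xs):
    # Bilinearly interpolate feat (C, H, W) at fractional (y, x) locations.
    C, H, W = feat.shape
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 2)
    dy, dx = ys - y0, xs - x0
    # Weighted sum of the four neighbouring feature vectors.
    return (feat[:, y0, x0] * (1 - dy) * (1 - dx)
            + feat[:, y0 + 1, x0] * dy * (1 - dx)
            + feat[:, y0, x0 + 1] * (1 - dy) * dx
            + feat[:, y0 + 1, x0 + 1] * dy * dx)

def sample_box_features(feat, box, grid=3):
    # Pool a grid x grid patch of bilinearly resampled features inside `box`,
    # given as (y1, x1, y2, x2) in feature-map coordinates (assumed format).
    y1, x1, y2, x2 = box
    yy, xx = np.meshgrid(np.linspace(y1, y2, grid),
                         np.linspace(x1, x2, grid), indexing="ij")
    out = bilinear_sample(feat, yy.ravel(), xx.ravel())
    return out.reshape(feat.shape[0], grid, grid)

# One shared feature map for the whole search region; many cheap samples.
feat = np.random.rand(64, 32, 32).astype(np.float32)
boxes = [(4.5, 4.5, 12.5, 12.5), (5.0, 6.0, 13.0, 14.0)]
samples = [sample_box_features(feat, b) for b in boxes]
print(samples[0].shape)  # (64, 3, 3)
```

Because each candidate box costs only a handful of interpolations rather than a full forward pass, adding more samples is nearly free, which is the source of the speedup the abstract claims.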
Pages: 847-857
Page count: 11