Learning attention for object tracking with adversarial learning network

Cited by: 6

Authors:

Cheng, Xu [1,2,3]

Song, Chen [1]

Gu, Yongxiang [1]

Chen, Beijing [1,2,3]

Affiliations:

[1] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China

[2] Nanjing Univ Informat Sci & Technol, Jiangsu Key Lab Big Data Anal Technol, Nanjing 210044, Peoples R China

[3] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China

Source:

EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING | 2020 / Vol. 2020 / Issue 01

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Surveillance; Deep learning; Object tracking; Generative adversarial learning; RECOGNITION; FILTERS; IMAGES;

DOI:

10.1186/s13640-020-00535-1

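Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes: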

0808 ; 0809 ;

Abstract:

In recent years, artificial intelligence has been widely studied for intelligent surveillance analysis and security problems. Although many multimedia security approaches based on deep learning network models have been proposed, their performance still faces challenges that deserve in-depth research. On the one hand, the high computational complexity of current deep learning methods makes them hard to apply in real-time scenarios. On the other hand, it is difficult to obtain the specific features of a video by fine-tuning the network online with the object state of the first frame, which fails to capture the rich appearance variations of the object. To address these two issues, this paper proposes an effective object tracking method with learned attention that localizes the object and reduces the training time within an adversarial learning framework. First, a prediction network is designed to track the object in video sequences. The object positions in the first ten frames are used to fine-tune the prediction network, which can fully mine the specific features of the object. Second, the prediction network is integrated into the generative adversarial network framework, which randomly generates masks to capture object appearance variations by adaptively dropping out input features. Third, we present a spatial attention mechanism to improve the tracking performance. The proposed network can identify the mask that maintains the most robust features of the object over a long temporal span. Extensive experiments on two large-scale benchmarks demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
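The abstract mentions two feature-level operations: randomly generated masks that drop out input features, and a spatial attention mechanism. The following is a minimal sketch of those two generic ideas only, not the authors' network. All function names, the softmax-over-mean-activation attention, and the per-location Bernoulli mask are assumptions introduced here for illustration; the paper's actual architecture, mask generator, and training losses are not given in this record.

```python
import math
import random


def spatial_attention(features):
    """Hypothetical spatial attention: softmax over per-location mean activation.

    features: list of C channels, each an H x W nested list of floats.
    Returns an H x W map of non-negative weights that sum to 1.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    # Average across channels to get one score per spatial location.
    scores = [
        [sum(features[c][y][x] for c in range(channels)) / channels
         for x in range(width)]
        for y in range(height)
    ]
    peak = max(max(row) for row in scores)  # subtract the max for numerical stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def random_mask(height, width, drop_prob, rng):
    """Binary H x W mask; each location is dropped (0.0) with probability drop_prob."""
    return [[0.0 if rng.random() < drop_prob else 1.0 for _ in range(width)]
            for _ in range(height)]


def apply_spatial_map(features, spatial_map):
    """Multiply every channel elementwise by the same H x W map."""
    return [
        [[v * m for v, m in zip(row, map_row)]
         for row, map_row in zip(channel, spatial_map)]
        for channel in features
    ]


# Example: a 2-channel, 4 x 4 feature map filled with random values.
rng = random.Random(0)
features = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(2)]
attention = spatial_attention(features)            # weights over locations
mask = random_mask(4, 4, drop_prob=0.25, rng=rng)  # one candidate dropout mask
masked = apply_spatial_map(features, mask)         # features with locations dropped
```

In an adversarial setup, many such candidate masks would be scored by a learned component rather than sampled once; that selection step is omitted here because the record does not describe it.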

Pages: 21
