Learning attention for object tracking with adversarial learning network

Authors
Xu Cheng
Chen Song
Yongxiang Gu
Beijing Chen
Affiliations
[1] Nanjing University of Information Science and Technology,School of Computer and Software
[2] Nanjing University of Information Science and Technology,Jiangsu Key Laboratory of Big Data Analysis Technology
[3] Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology
Source
EURASIP Journal on Image and Video Processing | Volume 2020
Keywords
Surveillance; Deep learning; Object tracking; Generative adversarial learning;
DOI
Not available
Abstract
Artificial intelligence has been widely applied to intelligent surveillance analysis and security problems in recent years. Although many multimedia security approaches based on deep learning models have been proposed, their performance still faces challenges that deserve in-depth research. On the one hand, the high computational complexity of current deep learning methods makes them hard to apply in real-time scenarios. On the other hand, it is difficult to obtain the specific features of a video by fine-tuning the network online with only the object state in the first frame, which fails to capture the rich appearance variations of the object. To address these two issues, this paper proposes an effective object tracking method with learned attention that localizes the object and reduces training time within an adversarial learning framework. First, a prediction network is designed to track the object in video sequences. The object positions in the first ten frames are used to fine-tune the prediction network, which fully mines the specific features of an object. Second, the prediction network is integrated into a generative adversarial network framework that randomly generates masks to capture object appearance variations by adaptively dropping out input features. Third, a spatial attention mechanism is presented to improve tracking performance. The proposed network can identify the mask that preserves the most robust features of the object over a long temporal span. Extensive experiments on two large-scale benchmarks demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
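The two mechanisms described in the abstract — randomly generated dropout masks over input features and a spatial attention map — can be illustrated with a minimal numpy sketch. This is not the paper's implementation; the function names, the keep probability, and the softmax-over-positions form of the attention are illustrative assumptions.

```python
import numpy as np

def random_dropout_masks(num_masks, h, w, keep_prob=0.7, seed=0):
    # Randomly generated binary masks, as the abstract describes: each mask
    # drops a subset of spatial feature positions so the network must cope
    # with (simulated) appearance variations of the object.
    # keep_prob is an illustrative assumption, not a value from the paper.
    rng = np.random.default_rng(seed)
    return (rng.random((num_masks, h, w)) < keep_prob).astype(np.float32)

def spatial_attention(features):
    # One common form of spatial attention (a hypothetical instantiation
    # here): softmax over spatial positions of the channel-averaged response.
    energy = features.mean(axis=0)                 # (H, W)
    weights = np.exp(energy - energy.max())
    return weights / weights.sum()                 # sums to 1 over positions

def apply_mask_with_attention(features, mask):
    # Masked, attention-weighted features; (H, W) factors broadcast
    # over the channel axis of the (C, H, W) feature map.
    return features * mask * spatial_attention(features)

# Toy feature map: 8 channels on a 5x5 spatial grid
feats = np.ones((8, 5, 5), dtype=np.float32)
masks = random_dropout_masks(num_masks=3, h=5, w=5)
out = apply_mask_with_attention(feats, masks[0])   # shape (8, 5, 5)
```

In the adversarial setting the abstract outlines, the generator would produce such masks while the discriminator (the tracker's classifier) learns to identify the mask that preserves the most robust features over time.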