Feature selection accelerated convolutional neural networks for visual tracking

Cited: 0
Authors
Zhiyan Cui
Na Lu
Affiliations
[1] Xi’an Jiaotong University, Systems Engineering Institute
[2] Beijing Advanced Innovation Center for Intelligent Robots and Systems
Source
Applied Intelligence | 2021 / Vol. 51
Keywords
Visual tracking; Mutual information; Feature selection; RoIAlign
DOI
Not available
Abstract
Most existing tracking methods based on convolutional neural networks (CNNs) are too slow for real-time applications, despite their excellent tracking accuracy compared with traditional methods. Moreover, CNN tracking solutions are memory intensive and require considerable computational resources. In this paper, we propose a time-efficient and accurate tracking scheme: a feature selection accelerated CNN (FSNet) tracking solution based on MDNet (Multi-Domain Network). The large number of convolutional operations is a major contributor to the high computational cost of MDNet. To reduce the computational complexity, we incorporate an efficient mutual information-based feature selection over the convolutional layers, which reduces the redundancy in the feature maps. Since tracking is a typical binary classification problem, redundant feature maps can simply be pruned with a negligible effect on tracking performance. To further accelerate the tracker, a RoIAlign layer is added so that convolution is applied to the entire image instead of to each RoI (Region of Interest) separately; the bilinear interpolation in RoIAlign also reduces misalignment errors for the tracked target. In addition, a new fine-tuning strategy for the fully connected layers accelerates the online updating process. By combining these strategies, the accelerated CNN reaches 60 FPS (frames per second) on a GPU, compared with 1 FPS for the original MDNet, with very little impact on tracking accuracy. We evaluated the proposed solution on four benchmarks: OTB50, OTB100, VOT2016 and UAV123. Extensive comparison results verify the superior performance of FSNet.
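The mutual information-based pruning of convolutional feature maps described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's exact procedure: the histogram-based MI estimator, the spatial average pooling used to summarize each channel, and the `keep_ratio` parameter are all assumptions made for the sake of a self-contained example.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X; Y) in nats between a continuous
    channel response x and binary labels y (1 = target, 0 = background)."""
    joint, _, _ = np.histogram2d(x, y, bins=(bins, 2))
    pxy = joint / joint.sum()                  # joint distribution
    px = pxy.sum(axis=1, keepdims=True)        # marginal over x bins
    py = pxy.sum(axis=0, keepdims=True)        # marginal over labels
    nz = pxy > 0                               # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def select_channels(feature_maps, labels, keep_ratio=0.5):
    """Rank conv channels by MI with the target/background label and
    return the indices of the most informative ones; the rest would
    simply be pruned.

    feature_maps: (N, C, H, W) activations from one conv layer
    labels:       (N,) binary labels for the N training samples
    """
    n, c = feature_maps.shape[:2]
    # summarize each channel by its spatially averaged response
    pooled = feature_maps.reshape(n, c, -1).mean(axis=2)
    scores = np.array([mutual_information(pooled[:, j], labels)
                       for j in range(c)])
    k = max(1, int(round(keep_ratio * c)))
    return np.sort(np.argsort(scores)[::-1][:k])
```

A channel whose pooled response carries no information about the target/background label scores near zero MI and is discarded, which is why pruning barely affects a binary classification task.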
Pages: 8230–8244
Page count: 14