Siamese network tracking based on high-level semantic embedding

Authors
Pu L. [1]
Li H. [1]
Hou Z. [2]
Feng X. [3]
He Y. [1]
Affiliations
[1] Combat Support College, Rocket Force University of Engineering, Xi’an
[2] School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an
[3] College of Artificial Intelligence, Yango University, Fuzhou
Funding
National Natural Science Foundation of China
Keywords
computer vision; feature fusion; semantic embedding; Siamese network; visual tracking;
DOI
10.13700/j.bh.1001-5965.2021.0319
Abstract
To improve the feature expression ability of the Siamese network without deepening the network, a Siamese network tracking algorithm based on high-level semantic embedding is proposed. First, a semantic embedding module is designed from convolution and up-sampling operations. The module integrates deep features with shallow features in order to optimize the shallow features, and it can be flexibly designed and deployed for any network. Then, under the Siamese network framework, two semantic embedding modules are added between different layers of the AlexNet backbone network. Cyclic optimization is carried out in the offline training stage to gradually transfer deep semantic information to the shallow feature layers. In the tracking stage, the semantic embedding modules are discarded and the original network structure is used. Experimental results show that, compared with SiamFC on the OTB2015 dataset, the accuracy is improved by 0.102 and the success rate is increased by 0.054. © 2023 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
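The semantic embedding step described in the abstract (convolution, then up-sampling, then fusion of deep features into shallow ones) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not specify the kernel size, the up-sampling mode, or the fusion operator, so the 1x1 convolution, nearest-neighbour up-sampling, and element-wise addition used here are assumptions, and the function names `conv1x1`, `upsample_nearest`, and `semantic_embed` are hypothetical.

```python
def conv1x1(feat, weight):
    """1x1 convolution: mix channels at every spatial position.

    feat is [C_in][H][W]; weight is [C_out][C_in] (assumed kernel size).
    """
    c_in, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[[sum(weight[o][i] * feat[i][y][x] for i in range(c_in))
              for x in range(w)]
             for y in range(h)]
            for o in range(len(weight))]


def upsample_nearest(feat, out_h, out_w):
    """Nearest-neighbour up-sampling of each channel (assumed mode)."""
    h, w = len(feat[0]), len(feat[0][0])
    return [[[ch[y * h // out_h][x * w // out_w] for x in range(out_w)]
             for y in range(out_h)]
            for ch in feat]


def semantic_embed(shallow, deep, weight):
    """Embed deep features into shallow ones: conv -> up-sample -> add.

    Element-wise addition is an assumed fusion operator.
    """
    h, w = len(shallow[0]), len(shallow[0][0])
    projected = conv1x1(deep, weight)  # match the shallow channel count
    if len(projected) != len(shallow):
        raise ValueError("weight must map deep channels to shallow channels")
    resized = upsample_nearest(projected, h, w)
    return [[[shallow[c][y][x] + resized[c][y][x] for x in range(w)]
             for y in range(h)]
            for c in range(len(shallow))]


# Toy input: one shallow channel (4x4) and two deep channels (2x2).
shallow = [[[0.0] * 4 for _ in range(4)]]
deep = [[[1.0, 2.0], [3.0, 4.0]],
        [[2.0, 2.0], [2.0, 2.0]]]
weight = [[1.0, 0.5]]  # 2 deep channels -> 1 shallow channel
fused = semantic_embed(shallow, deep, weight)
```

Per the abstract, modules of this kind are used only during offline training; at tracking time they are dropped and the original backbone runs unchanged, so the sketch does not affect inference cost.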
Pages: 792-803
Page count: 11