Hierarchical Attention Siamese Network for Thermal Infrared Target Tracking

Times cited: 0
Authors
Yuan, Di [1]
Liao, Donghai [1]
Huang, Feng [2]
Qiu, Zhaobing [2]
Shu, Xiu [3]
Tian, Chunwei [4, 5]
Liu, Qiao [6]
Affiliations
[1] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510555, Peoples R China
[2] Fuzhou Univ, Sch Mech Engn & Automat, Fuzhou 350108, Peoples R China
[3] Guangzhou Univ, Sch Comp Sci & Cyber Engn, Guangzhou 510006, Peoples R China
[4] Northwestern Polytech Univ, Sch Software, Xian 710072, Shaanxi, Peoples R China
[5] Northwestern Polytech Univ, Yangtze River Delta Res Inst, Taicang 215000, Jiangsu, Peoples R China
[6] Chongqing Normal Univ, Natl Ctr Appl Math, Chongqing 401331, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Target tracking; Feature extraction; Convolution; Training; Interference; Accuracy; Support vector machines; Attention mechanism; feature extraction; feature fusion; Siamese network; thermal infrared (TIR) target tracking;
DOI
10.1109/TIM.2024.3462973
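Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract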
Thermal infrared (TIR) target tracking is an important topic in computer vision. TIR images are unaffected by ambient light and adapt well to varied environments, so they are widely used in battlefield perception, video surveillance, and assisted driving. However, TIR images carry relatively little information and lack target texture, which significantly degrades the accuracy of TIR tracking methods. To address these problems, we propose SiamHAN, a TIR target tracking method based on a Siamese network with a hierarchical attention mechanism. Specifically, the CIoU loss is introduced to make full use of the regression box information so that the loss is calculated more accurately. The global context network (GCNet) attention mechanism is introduced to restructure feature extraction so that it captures the fine-grained information in TIR images. Meanwhile, the ECANet attention mechanism is used to fuse features hierarchically across the layers of the Siamese backbone network, so that the multilayer feature information is fully used to represent the target. On the LSOTB-TIR benchmark, SiamHAN achieves a 2.9% increase in success rate and a 4.3% increase in precision relative to the baseline tracker. Experiments show that the proposed SiamHAN method achieves competitive tracking results on TIR testing datasets.
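For readers unfamiliar with the CIoU loss named in the abstract, the following is a minimal sketch of the commonly used Complete-IoU formulation for a single pair of boxes. It is not the authors' implementation: the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabiliser are assumptions made for illustration.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper: box format and eps are illustrative assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Overlap term: intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Distance term: squared centre distance divided by the squared
    # diagonal of the smallest box enclosing both inputs.
    rho2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio term: penalises a mismatch in width/height ratio.
    v = (4.0 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Unlike a plain IoU loss, the distance and aspect-ratio terms still vary when the boxes do not overlap, which is one way a regression loss can use more of the box geometry.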
Pages: 11