DANet: Multi-scale UAV Target Detection with Dynamic Feature Perception and Scale-aware Knowledge Distillation

Times cited: 1
Authors
Fang, Houzhang [1]
Liao, Zikai [1]
Wang, Lu [1]
Li, Qingshan [1]
Chang, Yi [2]
Yan, Luxin [2]
Wang, Xuhua [1]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China;
Keywords
Unmanned aerial vehicle; multi-scale infrared target detection; attention mechanism; contrastive learning; knowledge distillation;
DOI
10.1145/3581783.3612146
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-scale infrared unmanned aerial vehicle (UAV) target (IRUT) detection in dynamic scenarios remains challenging because of weak target features, varying shapes and poses, and complex background interference. Current detection methods struggle to address these issues accurately and efficiently. In this paper, we design a dynamic attentive network (DANet) incorporating a scale-adaptive feature enhancement mechanism (SaFEM) and an attention-guided cross-weighting feature aggregator (ACFA). The SaFEM uses separable deformable convolution (SDC) to adaptively adjust the network's receptive fields at hierarchical network levels, which enhances the network's multi-scale IRUT awareness. The ACFA, modulated by two crossing attention mechanisms, strengthens structural and semantic properties on neighboring levels so that multi-scale IRUT features from different levels are represented accurately. A plug-and-play anti-distractor contrastive regularization (ADCR) is also imposed on DANet; it enforces similarity on features of targets and distractors from a new uncompressed feature projector (UFP) to increase the network's anti-distractor ability in complex backgrounds. To further improve the multi-scale UAV detection performance of DANet while maintaining its efficiency advantage, we propose a novel scale-specific knowledge distiller (SSKD) based on a divide-and-conquer strategy. In the "divide" stage, we intentionally construct three task-oriented teachers to learn tailored knowledge for small-, medium-, and large-scale IRUTs. In the "conquer" stage, we propose a novel element-wise attentive distillation module (EADM), which employs a pixel-wise attention mechanism to highlight teacher and student IRUT features and incorporates IRUT-associated prior knowledge for the collaborative transfer of refined multi-scale IRUT features to DANet.
Extensive experiments on real infrared UAV datasets demonstrate that our DANet is able to detect multi-scale UAVs with a satisfactory balance between accuracy and efficiency.
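The "conquer" stage described in the abstract can be illustrated with a minimal sketch of attention-weighted feature distillation. This is not the authors' EADM: the function names, the softmax-based pixel-wise attention, the averaging of teacher and student attention, and the use of a binary target mask as "prior knowledge" are all assumptions made here for illustration only.

```python
import numpy as np

def spatial_attention(feat, temperature=1.0):
    """Hypothetical pixel-wise attention for a (C, H, W) feature map.

    Mean absolute activation per pixel is passed through a softmax and
    rescaled so the map averages to 1 over the H x W grid.
    """
    _, h, w = feat.shape
    energy = np.abs(feat).mean(axis=0).reshape(-1) / temperature
    energy = energy - energy.max()          # numerical stability
    weights = np.exp(energy)
    weights = weights / weights.sum()
    return (weights * h * w).reshape(h, w)

def attentive_distillation_loss(student, teacher, prior_mask=None, temperature=1.0):
    """Attention-weighted squared error between student and teacher features.

    prior_mask is an assumed (H, W) weight map, e.g. 1 inside target boxes,
    standing in for the target-associated prior mentioned in the abstract.
    """
    attn = 0.5 * (spatial_attention(teacher, temperature)
                  + spatial_attention(student, temperature))
    if prior_mask is not None:
        attn = attn * prior_mask
    sq_err = ((student - teacher) ** 2).mean(axis=0)   # per-pixel error, (H, W)
    return float((attn * sq_err).sum() / sq_err.size)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(4, 8, 8))
    student = teacher + 0.1 * rng.normal(size=(4, 8, 8))
    mask = np.zeros((8, 8))
    mask[2:5, 2:5] = 1.0                    # assumed target region
    print(attentive_distillation_loss(student, teacher, prior_mask=mask))
```

Under the divide-and-conquer idea, one such loss term could be computed per scale-specific teacher and the terms summed; how the paper actually combines them is not stated in the abstract.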
Pages: 2121-2130
Page count: 10