PHSI-RTDETR: A Lightweight Infrared Small Target Detection Algorithm Based on UAV Aerial Photography

Cited by: 29

Authors:

Wang, Sen ^{[1,2]}

Jiang, Huiping ^{[1,2]}

Li, Zhongjie ^{[1,2]}

Yang, Jixiang ^{[1,2]}

Ma, Xuan ^{[1,2]}

Chen, Jiamin ^{[1,2]}

Tang, Xingqun ^{[1,2]}

Affiliations:

[1] Governance MOE, Key Lab Ethn Language Intelligent Anal & Secur, Beijing 100081, Peoples R China

[2] Minzu Univ China, Sch Informat Engn, Beijing 100081, Peoples R China

Source:

DRONES | 2024 / Vol. 8 / Issue 06

Funding:

National Natural Science Foundation of China;

Keywords:

small infrared target; UAV; RT-DETR; lightweight structure; partial convolution; HiLo attention; slimneck; Inner-GIoU;

DOI:

10.3390/drones8060240

Chinese Library Classification number:

TP7 [Remote sensing technology];

Subject classification codes:

081102 ; 0816 ; 081602 ; 083002 ; 1404 ;

Abstract:

To address low model accuracy caused by complex ground environments and uneven target scales, as well as high computational complexity, in unmanned aerial vehicle (UAV) aerial infrared image target detection, this study proposes a lightweight UAV aerial infrared small target detection algorithm called PHSI-RTDETR. First, an improved backbone feature extraction network is designed using the lightweight RPConv-Block module proposed in this paper; it effectively captures small target features and significantly reduces model complexity and computational burden while improving accuracy. Second, the HiLo attention mechanism is combined with an intra-scale feature interaction module to form an AIFI-HiLo module, which is integrated into a hybrid encoder to strengthen the model's focus on dense targets and reduce the rates of missed and false detections. Third, the slimneck-SSFF architecture is introduced as the model's cross-scale feature fusion architecture, using GSConv and VoVGSCSP modules to improve adaptability to infrared targets of various scales and to produce more semantic information while reducing network computation. Finally, the original GIoU loss is replaced with the Inner-GIoU loss, which uses a scaling factor to control auxiliary bounding boxes, speeding up convergence and improving detection accuracy for small targets. The experimental results show that, compared with RT-DETR, PHSI-RTDETR reduces model parameters by 30.55% and floating-point operations by 17.10%. Detection precision and speed increase by 3.81% and 13.39%, respectively, and mAP50 reaches 82.58%, demonstrating the potential of this model for drone infrared small target detection.
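The abstract states only that the Inner-GIoU loss uses a scaling factor to control auxiliary bounding boxes. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the commonly described Inner-IoU formulation: each auxiliary box shares its original box's centre, its width and height are multiplied by `ratio`, and the loss is the GIoU loss plus the ordinary IoU minus the IoU of the auxiliary boxes. The function names, the `(x1, y1, x2, y2)` box layout, and the default `ratio` are illustrative assumptions.

```python
def _inter_union(a, b):
    """Intersection and union areas of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter, union


def iou(a, b):
    """Ordinary intersection over union."""
    inter, union = _inter_union(a, b)
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Auxiliary box: same centre, width and height multiplied by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def giou_loss(pred, gt):
    """GIoU loss: 1 - (IoU - (enclosing area - union) / enclosing area)."""
    inter, union = _inter_union(pred, gt)
    iou_val = inter / union if union > 0 else 0.0
    # Area of the smallest axis-aligned box enclosing both boxes.
    c_area = (max(pred[2], gt[2]) - min(pred[0], gt[0])) * (
        max(pred[3], gt[3]) - min(pred[1], gt[1])
    )
    giou = iou_val - (c_area - union) / c_area if c_area > 0 else iou_val
    return 1.0 - giou


def inner_giou_loss(pred, gt, ratio=0.75):
    """Assumed form: GIoU loss + IoU - IoU of the scaled auxiliary boxes."""
    inner_iou = iou(scale_box(pred, ratio), scale_box(gt, ratio))
    return giou_loss(pred, gt) + iou(pred, gt) - inner_iou
```

Under this assumed form, `ratio = 1.0` reduces the loss to the plain GIoU loss, while a ratio below 1 shrinks the auxiliary boxes, so a partially overlapping prediction is penalised more than under GIoU alone.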


Pages: 21
