PHSI-RTDETR: A Lightweight Infrared Small Target Detection Algorithm Based on UAV Aerial Photography

Cited by: 29
Authors
Wang, Sen [1, 2]
Jiang, Huiping [1, 2]
Li, Zhongjie [1, 2]
Yang, Jixiang [1, 2]
Ma, Xuan [1, 2]
Chen, Jiamin [1, 2]
Tang, Xingqun [1, 2]
Affiliations
[1] Governance MOE, Key Lab Ethn Language Intelligent Anal & Secur, Beijing 100081, Peoples R China
[2] Minzu Univ China, Sch Informat Engn, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
small infrared target; UAV; RT-DETR; lightweight structure; partial convolution; HiLo attention; slimneck; Inner-GIoU;
DOI
10.3390/drones8060240
Chinese Library Classification
TP7 [Remote Sensing Technology];
Discipline classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Target detection in infrared images captured by unmanned aerial vehicles (UAVs) suffers from low accuracy, caused by complex ground environments and uneven target scales, and from high computational complexity. To address these issues, this study proposes a lightweight UAV aerial infrared small target detection algorithm called PHSI-RTDETR. First, an improved backbone feature extraction network is built from the lightweight RPConv-Block module proposed in this paper; it captures small target features effectively and reduces model complexity and computational burden while improving accuracy. Second, the HiLo attention mechanism is combined with an intra-scale feature interaction module to form an AIFI-HiLo module, which is integrated into a hybrid encoder to sharpen the model's focus on dense targets and reduce the rates of missed and false detections. Third, the slimneck-SSFF architecture is introduced as the model's cross-scale feature fusion architecture; it uses GSConv and VoVGSCSP modules to improve adaptability to infrared targets of various scales, producing more semantic information while reducing network computation. Finally, the original GIoU loss is replaced with the Inner-GIoU loss, which uses a scaling factor to control auxiliary bounding boxes, speeding up convergence and improving detection accuracy for small targets. Experimental results show that, compared with RT-DETR, PHSI-RTDETR reduces model parameters by 30.55% and floating-point operations by 17.10%, increases detection precision and speed by 3.81% and 13.39%, respectively, and reaches an mAP50 of 82.58%, demonstrating the potential of this model for UAV infrared small target detection.
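The abstract does not give the loss formula, so the following is only a minimal sketch of how an Inner-GIoU loss might be computed for a single pair of boxes. It assumes the commonly cited Inner-IoU formulation, in which auxiliary boxes share the centers of the original boxes but have their width and height multiplied by a scaling factor, and the loss is taken as L_GIoU + IoU - IoU_inner. The function names, the `(x1, y1, x2, y2)` box layout, and the default `ratio` value are illustrative assumptions, not the authors' implementation.

```python
# Sketch of an Inner-GIoU loss for one box pair (assumed formulation, plain Python).
# Boxes are (x1, y1, x2, y2) tuples with x1 <= x2 and y1 <= y2.

def box_area(b):
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

def intersection(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def iou(a, b):
    inter = intersection(a, b)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0

def giou(a, b):
    inter = intersection(a, b)
    union = box_area(a) + box_area(b) - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclosing = (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    c = box_area(enclosing)
    if union <= 0 or c <= 0:
        return 0.0
    return inter / union - (c - union) / c

def scale_box(b, ratio):
    # Auxiliary box: same center, width and height multiplied by `ratio`.
    cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    hw, hh = (b[2] - b[0]) * ratio / 2, (b[3] - b[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)

def inner_giou_loss(pred, target, ratio=0.75):
    # ratio < 1 shrinks the auxiliary boxes, ratio > 1 enlarges them;
    # 0.75 is an arbitrary illustrative default.
    loss_giou = 1.0 - giou(pred, target)
    inner_iou = iou(scale_box(pred, ratio), scale_box(target, ratio))
    return loss_giou + iou(pred, target) - inner_iou
```

With `ratio=1.0` the auxiliary boxes coincide with the original boxes, so the sketch reduces to the ordinary GIoU loss; a production version would operate on batched tensors rather than tuples.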
Pages: 21