LI-YOLOv8: Lightweight small target detection algorithm for remote sensing images that combines GSConv and PConv

Citations: 0
Authors
Yan, Pingping [1]
Qi, Xiangming [1]
Jiang, Liang [2]
Affiliations
[1] Liaoning Tech Univ, Sch Software, Huludao, Liaoning, Peoples R China
[2] Tarim Univ, Sch Informat Engn, Alar, Xinjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
OBJECT DETECTION;
DOI
10.1371/journal.pone.0321026
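Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract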
Small target detection in remote sensing images faces three persistent challenges: features of small targets are difficult to extract, complex backgrounds are easily confused with targets, and computational complexity and resource consumption are high. We propose LI-YOLOv8, a lightweight small target detection algorithm for remote sensing images that combines GSConv and PConv. Using YOLOv8n as the baseline, the SiLU activation function in the CBS at the backbone network's SPPF is replaced with ReLU, which reduces interdependencies among parameters. RFAConv is then embedded after the first CBS to expand the receptive field and extract more features of small targets. An Efficient Multi-Scale Attention (EMA) mechanism is embedded at the end of C2f within the neck network to integrate more detailed information and sharpen the focus on small targets. The head network incorporates a lightweight detection head, GP-Detect, which combines GSConv and PConv to reduce the parameter count and computational demand. Inner-IoU and Wise-IoU v3 are integrated to design the Inner-Wise IoU loss function, which replaces the original CIoU loss function. This loss provides the algorithm with a gain distribution strategy, focuses on anchor boxes of ordinary quality, and strengthens generalization capability. We conducted ablation and comparative experiments on the public datasets RSOD and NWPU VHR-10. Compared to YOLOv8, our approach improved mAP@0.5 by 7.6% and 2.8% and mAP@0.5:0.95 by 2.1% and 1.1%. Parameters and GFLOPs were reduced by 10.0% and 23.2%, respectively, indicating higher detection accuracy together with lower parameter and computational costs. Generalization experiments on the TinyPerson, LEVIR-ship, brain-tumor, and smoke_fire_1 datasets improved mAP@0.5 by 2.6%, 5.3%, 2.6%, and 2.3%, respectively, demonstrating the algorithm's robust performance.
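To illustrate why a PConv-based head can lower computational cost, the following is a minimal sketch that compares the multiply-accumulate count of a dense convolution with that of a partial convolution, which convolves only a subset of channels and passes the rest through unchanged. The feature-map size, channel count, and the 1/4 partial ratio are illustrative assumptions, not values taken from the abstract, and the function names are hypothetical; this is not the configuration of GP-Detect.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates of a dense k x k convolution
    (stride 1, 'same' padding, no bias) on an h x w feature map."""
    return h * w * k * k * c_in * c_out


def pconv_macs(h, w, k, channels, partial_ratio=0.25):
    """Multiply-accumulates of a partial convolution: only the first
    c_p = channels * partial_ratio channels are convolved; the remaining
    channels are passed through untouched and cost nothing here.
    partial_ratio=0.25 is an assumed, illustrative value."""
    c_p = int(channels * partial_ratio)
    return conv_macs(h, w, k, c_p, c_p)


# Hypothetical feature map: 80 x 80, 64 channels, 3 x 3 kernel.
h, w, k, c = 80, 80, 3, 64
dense = conv_macs(h, w, k, c, c)
partial = pconv_macs(h, w, k, c)
print(f"dense: {dense}, partial: {partial}, ratio: {partial / dense}")
```

Under these assumptions the cost scales with the square of the partial ratio, so convolving a quarter of the channels needs one sixteenth of the multiply-accumulates of the dense layer. The overall 23.2% GFLOPs reduction reported in the abstract covers the whole network and cannot be derived from this single-layer estimate.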
Pages: 25