MFFCI-YOLOv8: A Lightweight Remote Sensing Object Detection Network Based on Multiscale Features Fusion and Context Information

Citations: 4
Authors
Xu, Sheng [1]
Song, Lin [2]
Yin, Junru [2]
Chen, Qiqiang [2]
Zhan, Tianming [3]
Huang, Wei [2]
Affiliations
[1] Zhengzhou Univ Light Ind, Sch Elect Informat, Zhengzhou 450002, Peoples R China
[2] Zhengzhou Univ Light Ind, Sch Comp Sci & Technol, Zhengzhou 450002, Peoples R China
[3] Nanjing Audit Univ, Jiangsu Modern Intelligent Audit Integrated Applic, Sch Comp Sci, Nanjing 211815, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Accuracy; YOLO; Computational modeling; Convolution; Remote sensing; Neck; Logic gates; Sensors; Hardware; Light weight; object detection; remote sensing images (RSIs); YOLOv8s;
DOI
10.1109/JSTARS.2024.3474689
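Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract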
Most current research focuses on improving experimental accuracy with large models and often neglects deployment challenges, although certain remote sensing devices increasingly need lightweight algorithms. Moreover, remote sensing images (RSIs) often contain numerous small, densely distributed targets, which are difficult to detect. To address these issues, we improve the YOLOv8s network and develop a lightweight remote sensing object detection (RSOD) network based on multiscale features fusion and context information (MFFCI-YOLOv8). The network combines multiscale feature fusion with contextual information to detect objects in RSIs accurately. First, we introduce the lightweight CSP bottleneck with attention module, which uses partial convolution and the SimAM attention mechanism to reduce the number of parameters and the computational complexity while enhancing feature extraction. Second, we design the gate spatial pyramid pooling fast module to strengthen the model's perception of scale and contextual information, thereby improving the detection of small objects. Last, we employ the multiscale fusion lightweight neck module for more efficient multiscale feature fusion, which prevents small objects from being lost. Compared with YOLOv8s, the overall model reduces the number of parameters by 7.7% and FLOPs by 11.9%. We validate the accuracy of MFFCI-YOLOv8 on two remote sensing datasets, NWPU VHR-10 and VisDrone. The experimental results show that the model offers low computational cost and high detection accuracy relative to other RSOD models and other YOLO models.
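The abstract attributes part of the parameter and FLOP savings to partial convolution. The sketch below is a rough illustration of why that helps. It assumes the term refers to the FasterNet-style scheme, in which only a fraction of the channels is convolved and the rest pass through unchanged. The function names, the 64-channel layer, the 80 x 80 feature map, and the 1/4 channel ratio are all hypothetical values chosen for illustration and are not taken from the paper.

```python
def conv_cost(c_in, c_out, k, h, w):
    """Parameters and multiply-accumulates (MACs) of a bias-free k x k
    convolution with stride 1 and 'same' padding on an h x w feature map."""
    params = k * k * c_in * c_out
    macs = params * h * w  # every weight is used once per output position
    return params, macs


def partial_conv_cost(channels, k, h, w, ratio=0.25):
    """Cost when only the first `ratio` of the channels is convolved and the
    remaining channels are copied through untouched (assumed scheme)."""
    c_p = int(channels * ratio)  # number of channels that are convolved
    return conv_cost(c_p, c_p, k, h, w)


# Hypothetical layer: 64 channels, 3 x 3 kernel, 80 x 80 feature map.
full_params, full_macs = conv_cost(64, 64, 3, 80, 80)
part_params, part_macs = partial_conv_cost(64, 3, 80, 80)

print("regular conv:", full_params, "params,", full_macs, "MACs")
print("partial conv:", part_params, "params,", part_macs, "MACs")
```

With a 1/4 channel ratio, both the parameter count and the MAC count of that single layer scale with the square of the ratio. This is a per-layer effect only; the 7.7% and 11.9% reductions reported in the abstract are whole-network figures and cannot be derived from this sketch.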
Pages: 19743-19755
Page count: 13