YOLO-MFD: Remote Sensing Image Object Detection with Multi-Scale Fusion Dynamic Head

Cited by: 2
Authors
Zhang, Zhongyuan [1]
Zhu, Wenqiu [1]
Affiliations
[1] Hunan Univ Technol, Sch Comp Sci, Zhuzhou 412007, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024 / Vol. 79 / Issue 02
Keywords
Object detection; YOLOv8; multi-scale; attention mechanism; dynamic detection head
DOI
10.32604/cmc.2024.048755
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Remote sensing imagery, captured from high altitude, presents inherent challenges: objects appear at multiple scales, targets occupy small areas, and backgrounds are intricate. These traits often raise the missed-detection and false-detection rates of object recognition algorithms applied to remote sensing imagery, and they also cause inaccurate target localization and hinder precise target categorization. This paper addresses these challenges by proposing the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting the method, we examine the prevalent issues in remote sensing imagery analysis, in particular the difficulty existing object recognition algorithms have in comprehensively capturing critical image features amid varying scales and complex backgrounds. To resolve these issues, we make three changes. First, we propose a lightweight multi-scale module called CEF, which significantly improves the model's ability to capture important image features by merging multi-scale feature information and thereby reduces the missed detections and false alarms that are common in remote sensing imagery. Second, we add an additional small-target detection head and establish a residual link with the higher-level feature extraction module in the backbone, which allows the model to incorporate shallower information and significantly improves the accuracy of target localization in remote sensing images. Finally, we introduce a dynamic head attention mechanism, which gives the model greater flexibility and accuracy in recognizing targets of different shapes and sizes and consequently improves the precision of object detection.
Experimental results show that the YOLO-MFD model improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Pages: 2547-2563
Page count: 17
Related papers
31 in total
[1]   Computational Intelligence-Based Harmony Search Algorithm for Real-Time Object Detection and Tracking in Video Surveillance Systems [J].
Alotaibi, Maged Faihan ;
Omri, Mohamed ;
Abdel-Khalek, Sayed ;
Khalil, Eied ;
Mansour, Romany F. .
MATHEMATICS, 2022, 10 (05)
[2]  
And L.C, 2021, ULTRALYTICSYOLOV5 V5
[3]   A fast fused part-based model with new deep feature for pedestrian detection and security monitoring [J].
Cheng, Eric Juwei ;
Prasad, Mukesh ;
Yang, Jie ;
Khanna, Pritee ;
Chen, Bing-Hong ;
Tao, Xian ;
Young, Ku-Young ;
Lin, Chin-Teng .
MEASUREMENT, 2020, 151
[4]   Dynamic Head: Unifying Object Detection Heads with Attentions [J].
Dai, Xiyang ;
Chen, Yinpeng ;
Xiao, Bin ;
Chen, Dongdong ;
Liu, Mengchen ;
Yuan, Lu ;
Zhang, Lei .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :7369-7378
[5]   VFL: A Verifiable Federated Learning With Privacy-Preserving for Big Data in Industrial IoT [J].
Fu, Anmin ;
Zhang, Xianglong ;
Xiong, Naixue ;
Gao, Yansong ;
Wang, Huaqun ;
Zhang, Jing .
IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (05) :3316-3326
[6]   Improved YOLOv4 Marine Target Detection Combined with CBAM [J].
Fu, Huixuan ;
Song, Guoqing ;
Wang, Yuchao .
SYMMETRY-BASEL, 2021, 13 (04)
[7]   Fast R-CNN [J].
Girshick, Ross .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :1440-1448
[8]  
Guo YH, 2019, AAAI CONF ARTIF INTE, P8368
[9]   Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (09) :1904-1916
[10]   Coordinate Attention for Efficient Mobile Network Design [J].
Hou, Qibin ;
Zhou, Daquan ;
Feng, Jiashi .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :13708-13717