YOLO-SSP: an object detection model based on pyramid spatial attention and improved downsampling strategy for remote sensing images

Cited by: 10
Authors
Liu, Yongli [1]
Yang, Degang [1,2]
Song, Tingting [1]
Ye, Yichen [3]
Zhang, Xin [1]
Affiliations
[1] Chongqing Normal Univ, Coll Comp & Informat Sci, Chongqing 401331, Peoples R China
[2] Chongqing Engn Res Ctr, Educ Big Data Intelligent Percept & Applicat, Chongqing 401331, Peoples R China
[3] Southwest Univ, Coll Elect & Informat Engn, Chongqing 400715, Peoples R China
Keywords
Object detection; Remote sensing images; Small object; Attention mechanism; Convolutional networks
DOI
10.1007/s00371-024-03434-y
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Object detection is an essential task in remote sensing image processing. However, remote sensing images are characterized by a large range of object sizes and complex backgrounds, which makes object detection challenging, and the performance of existing object detectors on such images is still unsatisfactory. To tackle these problems, this paper proposes YOLO-SSP, an object detection model for remote sensing images based on the YOLOv8m model. First, the original downsampling layers are replaced with the proposed lightweight SPD-Conv module, which performs downsampling without loss of fine-grained information and improves the network's ability to learn feature representations. Second, to adapt to the large number of small objects in remote sensing images, a small object detection layer is added, which achieves the expected results. Finally, a pyramid spatial attention mechanism is proposed that obtains the weights of different spatial positions through hierarchical pooling operations, which effectively improves detection performance on small objects and on objects with complex backgrounds. Ablation experiments were conducted on the DIOR dataset, and YOLO-SSP was compared with other state-of-the-art models. YOLO-SSP obtains an mAP of 64.7%, an improvement of 2.3% relative to the baseline model. To demonstrate the generalizability and robustness of the improved model, comparison experiments were also performed on the TGRS-HRRSD and SIMD datasets, yielding mAP values of 77.2% and 64.9%, respectively. The code will be available at https://github.com/YongliLiu/SSP.
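The claim that downsampling can avoid losing fine-grained information can be illustrated with a space-to-depth rearrangement. The sketch below assumes that "SPD" in SPD-Conv refers to such a rearrangement, as in earlier SPD-Conv work, where it is followed by a non-strided convolution; the abstract does not give the details of the lightweight variant, so this is not the authors' implementation. The function name `space_to_depth`, the nested-list layout (channels x height x width), and the channel ordering are hypothetical choices made for illustration.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange a C x H x W nested list into (C * scale**2) x (H/scale) x (W/scale).

    Every input value is moved into the channel dimension rather than
    discarded, which is what separates this from strided convolution or pooling.
    The channel ordering here is an illustrative assumption.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    out = []
    for dy in range(scale):          # row offset inside each scale x scale cell
        for dx in range(scale):      # column offset inside each cell
            for channel in feature_map:
                # Keep every scale-th row and column starting at (dy, dx).
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


# One 4 x 4 channel holding the values 0..15.
feature_map = [[[r * 4 + c for c in range(4)] for r in range(4)]]
downsampled = space_to_depth(feature_map)

print(len(downsampled), len(downsampled[0]), len(downsampled[0][0]))  # 4 2 2
print(downsampled[0])  # [[0, 2], [8, 10]]
```

The spatial resolution is halved while the channel count grows by a factor of four, so all 16 input values survive. In a real detector this step would operate on tensors and be followed by a learned convolution that mixes the stacked channels.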
Pages: 1467-1484
Number of pages: 18