Small Object Detection in Remote Sensing Images Based on Feature Fusion and Attention

Cited by: 22
Authors
Zhang Yin [1, 2]
Zhu Guiyi [1, 2]
Shi Tianjun [3]
Zhang Kun [1, 2]
Yan Junhua [1, 2]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Space Photoelect Detect & Sensing Ind & Informat, Nanjing 211106, Jiangsu, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Astronaut, Nanjing 211106, Jiangsu, Peoples R China
[3] Harbin Inst Technol, Res Ctr Space Opt Engn, Harbin 150001, Heilongjiang, Peoples R China
Keywords
machine vision; small object detection; remote sensing image; feature fusion; attention mechanism; feature enhancement;
DOI
10.3788/AOS202242.2415001
Chinese Library Classification number
O43 [Optics];
Discipline classification code
070207; 0803;
Abstract
Small objects in remote sensing images carry little feature information and are difficult to localize. To address these issues, this paper proposes FFAM-YOLO (Feature Fusion and Attention Mechanism YOLO), a small-object detection algorithm for remote sensing images based on feature fusion and an attention mechanism. First, because the backbone network extracts too little effective information and its feature maps represent information weakly, the algorithm constructs a feature enhancement module (FEM) that fuses features from multiple receptive fields in the lower-level feature maps and improves the network's ability to extract object features. Second, using the low-level and high-level feature maps produced by the backbone network, the algorithm rebuilds the low-level and high-level feature fusion structures and implements a feature fusion module (FFM) to enhance the feature information of small targets. Third, a cascade attention mechanism (ESM), consisting of enhanced-efficient channel attention (E-ECA) and a spatial attention module (SAM), accurately captures small-object features. Finally, small objects are detected in the output dual-branch feature maps and the results are delivered. Experiments on USOD (Unicorn Small Object Dataset), a dataset constructed from remote sensing images, show that the proposed algorithm achieves a precision of 91.9% and a recall of 83.5%, an average precision (AP) of 89% at an intersection-over-union (IoU) threshold of 0.5 between the predicted box and the ground-truth box, and an AP of 32.6% for IoU thresholds from 0.5 to 0.95, with a detection rate of 120 frame/s. The algorithm is robust and runs in real time.
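The abstract reports precision and recall at an IoU threshold between predicted and ground-truth boxes. As an illustration only, the sketch below shows one common way such quantities are computed. It is not the paper's evaluation code: the function names `iou` and `precision_recall`, the `(x1, y1, x2, y2)` box format, and the greedy one-to-one matching are all assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumed convention for this sketch: x1 < x2 and y1 < y2.
    """
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truths, iou_threshold=0.5):
    """Precision and recall with greedy one-to-one matching.

    Assumption: `predictions` is already sorted by descending confidence.
    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(pred, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truths) - true_positives
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    return precision, recall


# Hypothetical boxes: one good match, one spurious detection, one missed object.
preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
gts = [(1, 1, 10, 10), (100, 100, 110, 110)]
p, r = precision_recall(preds, gts, iou_threshold=0.5)
```

An AP figure over IoU thresholds from 0.5 to 0.95 is usually obtained by repeating this kind of matching at several thresholds and averaging the area under the resulting precision-recall curves; that step is omitted here.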
Pages: 11