An Improved Aggregated-Mosaic Method for the Sparse Object Detection of Remote Sensing Imagery

Cited by: 23
Authors
Zhao, Boya [1]
Wu, Yuanfeng [1, 2]
Guan, Xinran [1, 2]
Gao, Lianru [1, 3]
Zhang, Bing [1, 3]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Key Lab Digital Earth Sci, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
[3] Univ Chinese Acad Sci, Coll Resources & Environm, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
remote sensing; object detection; data augmentation; sparse distribution; YOLOv5
DOI
10.3390/rs13132602
Chinese Library Classification
X [Environmental Science, Safety Science]
Subject classification code
08; 0830
Abstract
Object detection based on remote sensing imagery has become increasingly popular over the past few years. Unlike natural images taken by humans or surveillance cameras, remote sensing images are very large, so training and inference must run on cropped image tiles. However, objects appearing in remote sensing imagery are often sparsely distributed, and the labels for each class are imbalanced. This results in unstable training and inference. In this paper, we analyze the training characteristics of remote sensing images and propose the aggregated-mosaic training method, which fuses assigned-stitch augmentation with auto-target-duplication. In particular, based on the ground truth and the mosaic image size, the assigned-stitch augmentation gives each training sample an appropriate number of objects, which smooths the training procedure. Hard-to-detect objects, or those in classes with rare samples, are randomly selected and duplicated by the auto-target-duplication, which addresses sample imbalance and classes with insufficient results. Thus, the training process is able to focus on weak classes. We employ VEDAI and NWPU VHR-10, remote sensing datasets with sparse objects, to verify the proposed method. YOLOv5 adopts Mosaic as its augmentation method and is one of the state-of-the-art detectors, so we choose Mosaic (YOLOv5) as the baseline. Results demonstrate that our method outperforms Mosaic (YOLOv5) by 2.72% and 5.44% on 512 × 512 and 1024 × 1024 resolution imagery, respectively. Moreover, the proposed method outperforms Mosaic (YOLOv5) by 5.48% on the NWPU VHR-10 dataset.
Pages: 19