Improved Architecture and Training Strategies of YOLOv7 for Remote Sensing Image Object Detection

Citations: 2
Authors
Zhao, Dewei [1]
Shao, Faming [1]
Liu, Qiang [1]
Zhang, Heng [1]
Zhang, Zihan [1]
Yang, Li [1]
Affiliations
[1] Army Engn Univ PLA, Coll Field Engn, Nanjing 210007, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
remote sensing; object detection; improvement; YOLOv7; small object
DOI
10.3390/rs16173321
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The technology for object detection in remote sensing images finds extensive applications in production and people's lives, and improving the accuracy of image detection is a pressing need. With that goal, this paper proposes a range of improvements, rooted in the widely used YOLOv7 algorithm, after analyzing the requirements and difficulties in the detection of remote sensing images. Specifically, we strategically remove some standard convolution and pooling modules from the bottom of the network, adopting stride-free convolution to minimize the loss of information for small objects in the transmission. Simultaneously, we introduce a new, more efficient attention mechanism module for feature extraction, significantly enhancing the network's semantic extraction capabilities. Furthermore, by adding multiple cross-layer connections in the network, we more effectively utilize the feature information of each layer in the backbone network, thereby enhancing the network's overall feature extraction capability. During the training phase, we introduce an auxiliary network to intensify the training of the underlying network and adopt a new activation function and a more efficient loss function to ensure more effective gradient feedback, thereby elevating the network performance. In the experimental results, our improved network achieves impressive mAP scores of 91.2% and 80.8% on the DIOR and DOTA version 1.0 remote sensing datasets, respectively. These represent notable improvements of 4.5% and 7.0% over the original YOLOv7 network, significantly enhancing the efficiency of detecting small objects in particular.
Pages: 32