Improved Architecture and Training Strategies of YOLOv7 for Remote Sensing Image Object Detection

Cited by: 2

Authors:

Zhao, Dewei [1]

Shao, Faming [1]

Liu, Qiang [1]

Zhang, Heng [1]

Zhang, Zihan [1]

Yang, Li [1]

Affiliations:

[1] Army Engn Univ PLA, Coll Field Engn, Nanjing 210007, Peoples R China

Source:

REMOTE SENSING | 2024 / Vol. 16 / Issue 17

Funding:

National Natural Science Foundation of China;

Keywords:

remote sensing; object detection; improvement; YOLOv7; small object;

DOI:

10.3390/rs16173321

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Discipline classification code:

08 ; 0830 ;

Abstract:

The technology for object detection in remote sensing images finds extensive applications in production and people's lives, and improving the accuracy of image detection is a pressing need. With that goal, this paper proposes a range of improvements, rooted in the widely used YOLOv7 algorithm, after analyzing the requirements and difficulties in the detection of remote sensing images. Specifically, we strategically remove some standard convolution and pooling modules from the bottom of the network, adopting stride-free convolution to minimize the loss of information for small objects in the transmission. Simultaneously, we introduce a new, more efficient attention mechanism module for feature extraction, significantly enhancing the network's semantic extraction capabilities. Furthermore, by adding multiple cross-layer connections in the network, we more effectively utilize the feature information of each layer in the backbone network, thereby enhancing the network's overall feature extraction capability. During the training phase, we introduce an auxiliary network to intensify the training of the underlying network and adopt a new activation function and a more efficient loss function to ensure more effective gradient feedback, thereby elevating the network performance. In the experimental results, our improved network achieves impressive mAP scores of 91.2% and 80.8% on the DIOR and DOTA version 1.0 remote sensing datasets, respectively. These represent notable improvements of 4.5% and 7.0% over the original YOLOv7 network, significantly enhancing the efficiency of detecting small objects in particular.


Pages: 32
