Small object detection model for UAV aerial image based on YOLOv7

Cited by: 7

Authors:

Chen, Jinguang [1]

Wen, Ronghui [1]

Ma, Lili [1]

Affiliations:

[1] Xian Polytech Univ, Sch Comp Sci, Shaanxi Key Lab Clothing Intelligence, Xian 710048, Peoples R China

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2024 / Vol. 18 / Issue 03

Keywords:

UAV image detection; Small object detection; YOLOv7; Swin transformer; Detection head;

DOI:

10.1007/s11760-023-02941-0

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Object detection in Unmanned Aerial Vehicle (UAV) aerial images faces two main problems: small targets and target occlusion. To improve detection accuracy while maintaining efficiency, this work introduces SOD-YOLOv7, a small object detection model for UAV aerial images based on the real-time detector YOLOv7. To address small object detection, we design a module that combines Swin Transformer and convolution to better capture the global context of small objects in the image. We also introduce the Bi-Level Routing Attention (BRA) mechanism to strengthen the model's focus on small objects, and we add detection branches to improve detection at multiple scales. To handle occluded objects, we incorporate a dynamic detection head with deformable convolution and attention mechanisms to enhance the model's spatial awareness of targets. Experimental results on the VisDrone and CARPK UAV image datasets show that the mean average precision (mAP@0.5) of our model reaches 53.2% and 98.5%, respectively. Compared with the original YOLOv7, our model improves by 4.3% and 0.3%, respectively, demonstrating better performance in detecting small objects. The code will soon be released at https://github.com/Gentle-Hui/SOD-YOLOv7.


Pages: 2695-2707

Page count: 13
