Vis-YOLO: a lightweight and efficient image detector for unmanned aerial vehicle small objects

Cited by: 0

Authors:

Deng, Xiangyu [1]

Du, Jiangyong [1]

Affiliations:

[1] Northwest Normal University, College of Physics and Electronic Engineering, Lanzhou, People's Republic of China

Source:

JOURNAL OF ELECTRONIC IMAGING | 2024 / Vol. 33 / Issue 05

Keywords:

small objects; YOLOv8s; lightweight and efficient; unmanned aerial vehicle

DOI:

10.1117/1.JEI.33.5.053003

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

YOLO-series models are widely used in object detection. To address the challenge of small object detection, we analyze the limitations of existing detection models and propose Vis-YOLO, an object detection algorithm based on YOLOv8s. First, the number of down-sampling operations is reduced to retain more features, and the detection head is replaced to suit small objects. Then, deformable convolutional networks are used to improve the C2f module, strengthening its feature extraction ability. Finally, the separation and enhancement attention module is introduced into the model to give more weight to useful information. Experiments show that the improved Vis-YOLO model outperforms the YOLOv8s model on the VisDrone2019 dataset: precision improves by 5.4%, recall by 6.3%, and mAP50 by 6.8%. Moreover, the Vis-YOLO model is smaller and suitable for mobile deployment. This research provides a new method and idea for small object detection and has strong potential for practical application. (c) 2024 SPIE and IS&T
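The abstract's first step, reducing the number of down-sampling operations, can be illustrated with simple arithmetic. The sketch below is not the authors' code. It assumes a 640-pixel square input and a 16-pixel object, neither of which is stated in the abstract, and uses hypothetical helper names. It shows how the total stride of a detection branch determines both the feature-map size and how many feature-map cells a small object covers. Stock YOLOv8 detects at strides 8, 16, and 32; the stride of 4 is included only as an example of a less down-sampled branch and is not confirmed by the abstract.

```python
def feature_map_size(input_size: int, stride: int) -> int:
    """Side length of the feature map produced at a given total stride."""
    return input_size // stride


def object_extent_in_cells(object_px: float, stride: int) -> float:
    """Number of feature-map cells spanned by an object `object_px` pixels wide."""
    return object_px / stride


# Assumed values for illustration only (not taken from the abstract).
INPUT_SIZE = 640  # square input resolution in pixels
OBJECT_PX = 16    # width of a hypothetical small object in pixels

for stride in (4, 8, 16, 32):
    size = feature_map_size(INPUT_SIZE, stride)
    cells = object_extent_in_cells(OBJECT_PX, stride)
    print(f"stride {stride:>2}: {size}x{size} map, object spans {cells:.1f} cells")
```

Under these assumptions, the object covers less than one cell at stride 32 but several cells at the smaller strides, which is the usual motivation for keeping higher-resolution feature maps when targets are small.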

Pages: 15

