Improved Small Object Detection Algorithm Based on YOLOv5

Cited by: 1

Authors:

Xu, Bo [1]

Gao, Bin [2]

Li, Yunhu [3]

Affiliations:

[1] Inspur Elect Informat Ind Co Ltd, Jinan 250101, Peoples R China

[2] Harbin Inst Technol, Shenzhen 518055, Peoples R China

[3] OMRON Ind Automat China Co Ltd, Shanghai 200120, Peoples R China

Source:

IEEE INTELLIGENT SYSTEMS | 2024 / Vol. 39 / No. 05

Keywords:

Feature extraction; YOLO; Head; Semantics; Intelligent systems; Neck; Remote sensing

DOI:

10.1109/MIS.2024.3399053

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

YOLOv5 is a popular object detection algorithm that is widely used in various industrial fields, especially in the field of autonomous driving. However, this algorithm has problems, such as false positives and false negatives when detecting small targets. The article proposes an improved method for small object detection using YOLOv5s. First, a multilevel feature fusion detection head is proposed to extract larger feature maps from the backbone of the model, improving the ability to extract features of small objects. Second, a decoupled attention mechanism is introduced at each detection head, which separates the detection of object box position, object box confidence, and class probability to reduce confusion between different feature information. Finally, the focal minimum points distance intersection over union loss function is adopted to mitigate the effects of class imbalance and poor-quality object pixels.
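
The loss named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption: MPDIoU is taken in its commonly published form (IoU minus the squared distances between the two boxes' top-left and bottom-right corners, each normalized by the squared image diagonal), and the "focal" part is assumed to be an IoU-based weight of the form IoU^gamma. The function names, the `gamma` default, and the (x1, y1, x2, y2) box layout are hypothetical and may differ from the authors' implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, target, img_w, img_h):
    """MPDIoU: IoU penalized by normalized squared corner distances."""
    # Squared distance between the top-left corners.
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    # Squared distance between the bottom-right corners.
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    return iou(pred, target) - d1_sq / norm - d2_sq / norm


def focal_mpdiou_loss(pred, target, img_w, img_h, gamma=0.5):
    """Assumed focal form: weight (1 - MPDIoU) by IoU ** gamma."""
    # Low-overlap (poor-quality) boxes get a smaller weight, so they
    # contribute less to the regression loss.
    weight = iou(pred, target) ** gamma
    return weight * (1.0 - mpdiou(pred, target, img_w, img_h))


if __name__ == "__main__":
    pred = (0.0, 0.0, 10.0, 10.0)
    target = (5.0, 0.0, 15.0, 10.0)
    print(mpdiou(pred, target, 100, 100))
    print(focal_mpdiou_loss(pred, target, 100, 100))
```

Under this assumed weighting, a prediction that matches its target exactly has zero loss, and a prediction with no overlap also has zero weight. That behavior is a property of the assumed focal form, not something the abstract states.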


Pages: 57-65

Page count: 9
