POD-YOLO Object Detection Model Based on Bi-directional Dynamic Cross-level Pyramid Network

Cited by: 0

Authors:

Zhang, Yu [1]

Ma, Ming [1]

Wang, Zhongxiang [1]

Li, Jing [1]

Sun, Yan [1]

Affiliations:

[1] Shenyang Univ Chem Technol, Sch Comp Sci & Technol, Shenyang, Peoples R China

Source:

ENGINEERING LETTERS | 2024 / Vol. 32 / Issue 05

Keywords:

Image processing; Object detection; Feature pyramid; Multi-scale; Feature fusion

DOI:

Not available

Chinese Library Classification (CLC) number:

T [Industrial Technology]

Discipline classification code:

08

Abstract:

Existing heavy-backbone object detection models overlook the crucial role of cross-level interactive fusion of feature information in pyramid networks, which leaves them unable to detect occluded or small objects in complex scenes. In this paper, we present POD-YOLO, a new heavy-neck object detection model based on YOLOv5s. First, we propose the POD-RepC3 module to improve the model's ability to extract multi-layer features. Second, to address the large span of object sizes, we propose a bidirectional partial dynamic fusion module (Bi-PDC) as the detection neck of the pyramid network. This module preserves accurate localization signals and facilitates cross-level interactive fusion of feature information. Finally, we design the Reparameterized Bi-directional Dynamic Feature Pyramid Network (RepBi-DFPN), a deep feature fusion network that integrates contextual information and enhances both the feature expression and fusion capabilities of the model. Experimental results on the PASCAL VOC dataset show that the proposed method is effective: mAP@0.5 and mAP@0.5:0.95 reach 81.3% and 58.2%, respectively, improvements of 2.4% and 4.1% over the original YOLOv5s. The results also show that the model is competitive with state-of-the-art (SOTA) object detection models. By optimizing the feature fusion capability of the pyramid network, the algorithm effectively reduces false and missed detections and significantly improves the model's ability to detect multi-scale targets accurately.


Pages: 995-1003

Page count: 9

