HV-YOLOv8 by HDPconv: Better lightweight detectors for small object detection

Cited by: 1

Authors:

Wang, Wei [1,2,3]

Meng, Yuanze [1,2]

Li, Shun [1,2]

Zhang, Chenghong [1,2]

Affiliations:

[1] Chinese Acad Sci, Shenyang Inst Comp Technol, Donghu St, Shenyang 110168, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China

[3] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110169, Peoples R China

Source:

IMAGE AND VISION COMPUTING | 2024 / Vol. 147

Keywords:

HDPConv; HV-YOLOv8; Small object detection; Lightweighting

DOI:

10.1016/j.imavis.2024.105052

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Accurately identifying and localising small objects in images or videos is a critical challenge in computer vision. Small object detection is mostly applied in scenarios that demand real-time performance, such as pedestrian detection and autonomous driving. The tiny targets are typically distant objects or objects in low-resolution images, which makes it exceptionally difficult to extract effective feature information. YOLOv8's large downsampling factor produces deep feature maps in which tiny objects are hard to detect; we find that using residual structures in the convolution module can improve small object detection accuracy. Because this increases the computational cost, we lighten the convolution module to make it more suitable for practical applications and name it Halved Deep Pointwise Convolution (HDPConv). We also use a cross-level partial module, Variety of View Group Shuffle Cross Stage Partial Network (VOV-GSCSP), whose architecture and multi-scale information fusion keep the overall model lightweight while preserving rich gradient flows. On this basis, we propose a new lightweight network model, HV-YOLOv8. In multiple sets of comparative experiments on two datasets, against several state-of-the-art and classical solutions, we demonstrate the superiority of HV-YOLOv8: in particular, accuracy improves by 1.4% over YOLOv8, while the number of parameters and the amount of computation are drastically reduced.

Pages: 10