HV-YOLOv8 by HDPconv: Better lightweight detectors for small object detection

Cited by: 1
Authors
Wang, Wei [1, 2, 3]
Meng, Yuanze [1, 2]
Li, Shun [1, 2]
Zhang, Chenghong [1, 2]
Affiliations
[1] Chinese Acad Sci, Shenyang Inst Comp Technol, Donghu St, Shenyang 110168, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110169, Peoples R China
Keywords
HDPConv; HV-YOLOv8; Small object detection; Lightweighting
DOI
10.1016/j.imavis.2024.105052
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Accurately identifying and localising small objects in images or videos is a critical challenge in computer vision. The task arises mostly in scenarios that demand high real-time performance, such as pedestrian detection and autonomous driving. These tiny targets are typically objects at long distances or objects in low-resolution images, which makes it exceptionally difficult to extract effective feature information. The large downsampling factor of YOLOv8 yields deep feature maps on which tiny objects are hard to detect; we find that using residual structures in the convolution module can enhance the accuracy of small object detection. However, this increases the computational cost, so we lightened the convolution module to make it more suitable for practical applications and named it Halved Deep Pointwise Convolution (HDPConv). A cross-level partial module, Variety of View Group Shuffle Cross Stage Partial Network (VOV-GSCSP), is also used; its rational architecture and multi-scale information fusion keep the overall model lightweight while obtaining rich gradient flows. On this basis, we propose a new lightweight network model, HV-YOLOv8. In multiple sets of comparative experiments on two datasets, against several state-of-the-art and classical solutions, we demonstrate the superiority of HV-YOLOv8: in particular, accuracy improves by 1.4% compared with YOLOv8, while the number of parameters and the amount of computation are drastically reduced.
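The downsampling problem mentioned in the abstract can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the paper: it assumes the usual YOLOv8 detection-head strides of 8, 16 and 32, which the abstract does not state, and the function names `footprint` and `report` are hypothetical. It shows how many feature-map cells an object of a given pixel size spans at each stride.

```python
# Assumption: the standard YOLOv8 detection heads use strides 8, 16 and 32.
# The abstract itself gives no stride values.
STRIDES = (8, 16, 32)


def footprint(object_px: int, stride: int) -> float:
    """Side length of an object, measured in feature-map cells."""
    return object_px / stride


def report(object_px: int) -> dict:
    """Map each assumed stride to the object's footprint in cells."""
    return {stride: footprint(object_px, stride) for stride in STRIDES}


if __name__ == "__main__":
    for size in (16, 32, 96):
        cells = ", ".join(
            f"stride {s}: {c:.2f} cells" for s, c in report(size).items()
        )
        print(f"{size}x{size} px object -> {cells}")
```

Under these assumptions, a 16x16 pixel object covers only half a cell per side on the stride-32 map, which is one way to see why deep, heavily downsampled feature maps retain little information about tiny targets.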
Pages: 10