DCE-YOLOv8: Lightweight and Accurate Object Detection for Drone Vision

Citations: 0

Authors:

An, Jinsu [1]

Lee, Dong Hee [1]

Putro, Muhamad Dwisnanto [2]

Kim, Byeong Woo [1]

Affiliations:

[1] Univ Ulsan, Dept Elect Elect & Comp Engn, Ulsan 44610, South Korea

[2] Sam Ratulangi Univ, Dept Elect Engn, Manado 95115, Indonesia

Source:

IEEE ACCESS | 2024 / Volume 12

Funding:

National Research Foundation of Singapore;

Keywords:

Feature extraction; Drones; YOLO; Real-time systems; Neck; Computer vision; Head; Convolution; Cameras; Accuracy; Object detection; divided context extraction; lightweight; drone vision; YOLOv8;

DOI:

10.1109/ACCESS.2024.3481410

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Object detection using drones combines a drone-mounted camera with a computer vision algorithm to pinpoint the location of an object and determine its type. Drones can rapidly scan extensive areas, which facilitates efficient data collection and analysis, yields critical information, and supports swift response efforts. Despite these advantages, several challenges persist, including limited image resolution, small objects, overlapping objects, and densely concentrated object distributions. In this paper, we introduce DCE-YOLOv8, an advanced model based on YOLOv8 that is engineered to address the low detection rate of small objects in drone imagery. To detect small objects effectively, it is necessary either to enhance the resolution of drone images or to extract the features of small objects efficiently, and the extracted features must also be integrated efficiently. The ERB (Efficient Residual Bottleneck) and DCE (Divided Context Extraction) modules are incorporated into the backbone: the ERB module reduces the number of parameters to make the model more lightweight, while the DCE module focuses on extracting features pertinent to small objects. The rate of missed detections is then reduced by comprehensively merging the shallow and deep features in the neck. The proposed method is trained on the VisDrone dataset and demonstrates superior detection performance compared with other state-of-the-art methods. Compared with the YOLOv8 small version on the VisDrone dataset, the mean Average Precision improved by approximately 43%, from 22.8 mAP to 32.7 mAP, while the number of parameters decreased by about 57%, from 11,166,560 to 4,822,382. The average inference time per image is 11.4 ms, which is slower than YOLOv8's 5.9 ms, yet the model still maintains a frame rate of 87.71 FPS, underscoring its potential for real-time detection applications. The proposed method was further evaluated on the TT100K and AFO datasets, where it outperforms YOLOv8 small while maintaining a comparable average inference time. This paper is valuable for balancing object detection rates against real-time operational speed and can serve as a guiding reference for in-depth research in related fields.
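The relative figures quoted in the abstract can be reproduced from the raw numbers it reports. The following is a minimal sketch of that arithmetic only; the helper names `relative_change` and `fps_from_latency_ms` are illustrative and do not come from the paper, and the frame rate assumes one image is processed at a time.

```python
def relative_change(old: float, new: float) -> float:
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0


def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


# Values as reported in the abstract (YOLOv8 small vs. DCE-YOLOv8 on VisDrone).
map_gain = relative_change(22.8, 32.7)                  # mAP, positive = improvement
param_change = relative_change(11_166_560, 4_822_382)   # parameters, negative = reduction
fps = fps_from_latency_ms(11.4)                         # from 11.4 ms per image

print(f"mAP change:       {map_gain:+.1f}%")
print(f"parameter change: {param_change:+.1f}%")
print(f"frame rate:       {fps:.1f} FPS")
```

Under these assumptions the results agree with the abstract's "approximately 43%" and "about 57%" and with a frame rate of roughly 87.7 FPS; the small difference from the reported 87.71 FPS presumably comes from rounding of the latency.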


Pages: 170898-170912

Page count: 15

References (22 in total):

[1] Everingham M., 2012, VOC2012 RESULTS

[2] The Pascal Visual Object Classes (VOC) Challenge [J]. Everingham, Mark; Van Gool, Luc; Williams, Christopher K. I.; Winn, John; Zisserman, Andrew. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2010, 88 (02): 303-338

[3] Jocher G., 2023, Ultralytics YOLO8

[4] Jocher Glenn, 2022, Zenodo, DOI 10.5281/ZENODO.3908559

[5] Li CY, 2022, arXiv, DOI arXiv:2209.02976

[6] Li Xiang, 2020, Advances in Neural Information Processing Systems, V33

[7] Microsoft COCO: Common Objects in Context [J]. Lin, Tsung-Yi; Maire, Michael; Belongie, Serge; Hays, James; Perona, Pietro; Ramanan, Deva; Dollar, Piotr; Zitnick, C. Lawrence. COMPUTER VISION - ECCV 2014, PT V, 2014, 8693: 740-755

[8] Image Inpainting for Irregular Holes Using Partial Convolutions [J]. Liu, Guilin; Reda, Fitsum A.; Shih, Kevin J.; Wang, Ting-Chun; Tao, Andrew; Catanzaro, Bryan. COMPUTER VISION - ECCV 2018, PT XI, 2018, 11215: 89-105

[9] SF-YOLOv5: A Lightweight Small Object Detection Algorithm Based on Improved Feature Fusion Mode [J]. Liu, Haiying; Sun, Fengqian; Gu, Jason; Deng, Lixia. SENSORS, 2022, 22 (15)

[10] UCN-YOLOv5: Traffic Sign Object Detection Algorithm Based on Deep Learning [J]. Liu, Peilin; Xie, Zhaoyang; Li, Taijun. IEEE ACCESS, 2023, 11: 110039-110050
