Object Detection Algorithm Based on Improved YOLOv3

Cited by: 183

Authors:

Zhao, Liquan [1]

Li, Shuaiyang [1]

Affiliations:

[1] Northeast Elect Power Univ, Minist Educ, Key Lab Modern Power Syst Simulat & Control & Ren, Jilin 132012, Jilin, Peoples R China

Source:

ELECTRONICS | 2020 / Vol. 9 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

deep learning; object detection; YOLOv3; method; k-means cluster; DESIGN;

DOI:

10.3390/electronics9030537

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

The 'You Only Look Once' v3 (YOLOv3) method is among the most widely used deep learning-based object detection methods. It uses the k-means clustering method to estimate the initial width and height of the predicted bounding boxes. With this method, the estimated width and height are sensitive to the initial cluster centers, and processing large-scale datasets is time-consuming. To address these problems, a new clustering method for estimating the initial width and height of the predicted bounding boxes has been developed. First, it randomly selects one pair of width and height values from the widths and heights of the ground truth boxes as the first initial cluster center. Second, it constructs Markov chains based on the selected initial cluster center and uses the final point of each Markov chain as one of the other initial centers. In the construction of the Markov chains, the intersection-over-union method, rather than the square root method, is used to compute the distance between the selected initial cluster centers and each candidate point. Finally, the method continually updates the cluster centers with each new set of width and height values, where each set is only a part of the data selected from the dataset. Our simulation results show that the new method converges faster when initializing the width and height of the predicted bounding boxes and that it selects more representative initial widths and heights. Our proposed method achieves better performance than the YOLOv3 method in terms of recall, mean average precision, and F1-score.
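The seeding idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `chain_length` parameter, the acceptance rule (a plain ratio of distances), and the sample boxes are all assumptions made for illustration. The only parts taken from the abstract are the use of an intersection-over-union based distance between width-height pairs, a randomly drawn first center, and the use of the final state of a Markov chain as each further center.

```python
import random


def wh_iou(box, center):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], center[0]) * min(box[1], center[1])
    union = box[0] * box[1] + center[0] * center[1] - inter
    return inter / union


def iou_distance(box, center):
    """Distance used for clustering: 0 for identical shapes, near 1 for very different ones."""
    return 1.0 - wh_iou(box, center)


def nearest_distance(box, centers):
    """IoU distance from a box to its closest already-chosen center."""
    return min(iou_distance(box, c) for c in centers)


def seed_centers(boxes, k, chain_length=50, rng=None):
    """Pick k initial centers; chain_length and the acceptance rule are assumptions."""
    if rng is None:
        rng = random.Random(0)
    centers = [rng.choice(boxes)]  # first center: uniform random draw
    while len(centers) < k:
        current = rng.choice(boxes)
        d_cur = nearest_distance(current, centers)
        for _ in range(chain_length - 1):
            candidate = rng.choice(boxes)
            d_cand = nearest_distance(candidate, centers)
            # Metropolis-style step: prefer candidates far from the existing centers.
            if d_cur == 0 or rng.random() < d_cand / d_cur:
                current, d_cur = candidate, d_cand
        centers.append(current)  # final state of the chain becomes a center
    return centers


# Hypothetical ground-truth (width, height) pairs, in pixels.
boxes = [(12, 16), (14, 18), (60, 45), (62, 50), (150, 200), (160, 190)]
initial_centers = seed_centers(boxes, k=3)
```

The resulting `initial_centers` would then be passed to a k-means style refinement loop that assigns each box to the center with the smallest `iou_distance`. The mini-batch update step mentioned in the abstract is not shown here.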

Pages: 11