Modified Object Detection Method Based on YOLO

Cited by: 8

Authors:

Zhao, Xia [1]

Ni, Yingting [1]

Jia, Haihang [1]

Affiliations:

[1] Tongji Univ, Sch Elect & Informat Engn, Shanghai, Peoples R China

Source:

COMPUTER VISION, PT III | 2017 / Vol. 773

Keywords:

Deep learning; Object detection; Anchor box; Cluster center;

DOI:

10.1007/978-981-10-7305-2_21

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

YOLO (You Only Look Once), a 2D object detection method, is extremely fast because a single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. However, it makes comparatively many localization errors, and its training is relatively slow. Drawing on the idea of cluster centers in super-pixel segmentation and of anchor boxes in Faster R-CNN, we propose a modified method based on YOLO (M-YOLO for short). First, we replace YOLO's last fully connected layer with a convolutional layer, on which the cluster boxes (anchor boxes centered on the cluster centers) completely cover the whole image at the beginning of training. As a result, the new structure speeds up the training process. Second, we increase the number of grid cells, i.e. cluster centers, from 7 x 7 to a maximum of 17 x 17, and the number of predicted bounding boxes, i.e. anchor boxes, from 2 to a maximum of 9 per grid cell. This measure improves IOU performance. We also put forward a new kind of NMS (non-max suppression) to solve the problem raised by M-YOLO. The experimental results show that M-YOLO improves localization accuracy by about 10% and also improves the convergence speed of the training process.
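The grid and anchor-box counts in the abstract can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the authors' implementation: the helper names (`cluster_boxes`, `iou`, `greedy_nms`), the anchor shapes in `ANCHORS` (3 scales x 3 aspect ratios, in the spirit of Faster R-CNN), and the 0.5 threshold are all hypothetical. The suppression step shown is the standard greedy NMS, since the abstract does not describe the new NMS variant.

```python
import math
from itertools import product

# Hypothetical anchor shapes (w, h) in normalized image units:
# 3 scales x 3 aspect ratios = 9 anchors. The paper's actual shapes are not given here.
ANCHORS = [(s * math.sqrt(r), s / math.sqrt(r))
           for s in (0.2, 0.4, 0.8) for r in (0.5, 1.0, 2.0)]

def cluster_boxes(grid, anchors):
    """Center every anchor shape on every grid-cell center.

    Returns (cx, cy, w, h) boxes in normalized [0, 1] image coordinates.
    """
    centers = [((col + 0.5) / grid, (row + 0.5) / grid)
               for row, col in product(range(grid), repeat=2)]
    return [(cx, cy, w, h) for (cx, cy) in centers for (w, h) in anchors]

def to_corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def iou(a, b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    ax1, ay1, ax2, ay2 = to_corners(a)
    bx1, by1, bx2, by2 = to_corners(b)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Standard greedy NMS (not the paper's variant): visit boxes by
    descending score and keep a box only if it does not overlap an
    already kept box by more than iou_threshold. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

if __name__ == "__main__":
    print(len(cluster_boxes(7, ANCHORS[:2])))   # 7 x 7 grid, 2 boxes per cell: 98
    print(len(cluster_boxes(17, ANCHORS)))      # 17 x 17 grid, 9 boxes per cell: 2601
```

Under these assumptions, moving from a 7 x 7 grid with 2 boxes per cell to a 17 x 17 grid with 9 boxes per cell raises the number of candidate boxes from 98 to 2601. A much larger pool of overlapping candidates is one plausible reason a suppression step would need revisiting.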


Pages: 233-244

Page count: 12

References (6 in total):

[1] SLIC Superpixels Compared to State-of-the-Art Superpixel Methods [J].

Achanta, Radhakrishna; Shaji, Appu; Smith, Kevin; Lucchi, Aurelien; Fua, Pascal; Suesstrunk, Sabine.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (11): 2274-2281

[2] Girshick R., 2014, IEEE C COMP VIS PATT, DOI 10.1109/CVPR.2014.81

[3] Fast R-CNN [J].

Girshick, Ross.

2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 1440-1448

[4] SSD: Single Shot MultiBox Detector [J].

Liu, Wei; Anguelov, Dragomir; Erhan, Dumitru; Szegedy, Christian; Reed, Scott; Fu, Cheng-Yang; Berg, Alexander C.

COMPUTER VISION - ECCV 2016, PT I, 2016, 9905: 21-37

[5] REDMON J, 2016, PROC CVPR IEEE, P779, DOI 10.1109/CVPR.2016.91

[6] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [J].

Ren, Shaoqing; He, Kaiming; Girshick, Ross; Sun, Jian.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (06): 1137-1149
