Evaluation of Robust Spatial Pyramid Pooling Based on Convolutional Neural Network for Traffic Sign Recognition System

Cited by: 42
Authors
Dewi, Christine [1, 2]
Chen, Rung-Ching [1]
Tai, Shao-Kuo [1]
Affiliations
[1] Chaoyang Univ Technol, Dept Informat Management, Taichung 41349, Taiwan
[2] Satya Wacana Christian Univ, Fac Informat Technol, Central Java 50711, Indonesia
Keywords
spatial pyramid pooling; Yolo V3; object recognition; convolutional neural network; DEEP; VIDEO;
DOI
10.3390/electronics9060889
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Traffic sign recognition (TSR) is an important problem for real-world applications such as autonomous driving systems, where it plays a central role in guiding the driver. This paper focuses on Taiwan's prohibitory signs because of the lack of a database or research system for Taiwan's traffic sign recognition. It investigates several state-of-the-art object detection systems (Yolo V3, Resnet 50, Densenet, and Tiny Yolo V3) combined with spatial pyramid pooling (SPP). We adopt SPP to improve the feature-extraction backbone networks of Yolo V3, Resnet 50, Densenet, and Tiny Yolo V3. Furthermore, we use SPP to study multi-scale object features thoroughly. The models are evaluated on key metrics, including the mean average precision (mAP), workspace size, detection time, intersection over union (IoU), and the number of billion floating-point operations (BFLOPS). Our findings show that Yolo V3 SPP achieves the best total BFLOPS (65.69) and mAP (98.88%). In addition, Yolo V3 SPP has the highest average accuracy at 99%, followed by Densenet SPP at 87%, Resnet 50 SPP at 70%, and Tiny Yolo V3 SPP at 50%. Hence, SPP can improve the performance of all models in the experiment.
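Of the metrics listed in the abstract, intersection over union (IoU) is the simplest to illustrate. The following is a minimal sketch, not the authors' evaluation code: it assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` tuples, and the function name `iou` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Assumption: each box is (x_min, y_min, x_max, y_max) with
    x_min <= x_max and y_min <= y_max. The record does not state
    the box format used in the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the doubly counted overlap.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes give an empty union; return 0.
    return inter / union if union > 0 else 0.0
```

For example, two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, so their IoU is 1 / (4 + 4 − 1) = 1/7. A detection is typically counted as correct when its IoU with the ground-truth box exceeds a chosen threshold; the threshold used in the paper is not given in this record.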
Pages: 21