YOLOv4-5D: An Effective and Efficient Object Detector for Autonomous Driving

Cited by: 225
Authors
Cai, Yingfeng [1]
Luan, Tianyu [2]
Gao, Hongbo [3]
Wang, Hai [2]
Chen, Long [1]
Li, Yicheng [1]
Sotelo, Miguel Angel [4]
Li, Zhixiong [5]
Affiliations
[1] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Jiangsu, Peoples R China
[2] Jiangsu Univ, Sch Automot & Traff Engn, Zhenjiang 212013, Jiangsu, Peoples R China
[3] Univ Sci & Technol China, Dept Automat, Hefei 230026, Peoples R China
[4] Univ Alcala, Dept Comp Engn, Madrid 28801, Spain
[5] Yonsei Univ, Yonsei Frontier Lab, Seoul 03722, South Korea
Funding
National Natural Science Foundation of China;
Keywords
Autonomous driving; CSPDarknet53; object detection; Pruning; YOLOv4;
DOI
10.1109/TIM.2021.3065438
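Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract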
Object detection algorithms have become essential in autonomous vehicles: safe autonomous driving requires detection that is both highly accurate and fast at inference. The balance between the effectiveness and the efficiency of the object detector must therefore be considered. This article proposes a one-stage object detection framework, based on YOLOv4, that improves detection accuracy while supporting true real-time operation. The backbone network of the proposed framework is CSPDarknet53_dcn(P), in which the last output layer of CSPDarknet53 is replaced with deformable convolution to improve detection accuracy. For feature fusion, a new module, PAN++, is designed, and detection layers at five scales are used to improve the detection accuracy for small objects. In addition, because the limited computing resources of vehicle-mounted computing platforms can prevent the algorithm from meeting real-time requirements, this article proposes an optimized network pruning algorithm that uses sparse scaling factors to improve the existing channel pruning algorithm. Compared with YOLOv4, YOLOv4-5D improves the mean average precision by 4.23% on the BDD data set and by 1.68% on the KITTI data set. Finally, pruning the model increases the inference speed of YOLOv4-5D by 31.3%, and the memory footprint is only 98.1 MB, with detection accuracy almost unchanged. The proposed algorithm is capable of real-time detection at more than 66 frames/s (fps) and shows higher accuracy than previous approaches with a similar fps.
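The abstract does not spell out the pruning procedure. As a rough illustration only, the sketch below shows the generic idea behind channel pruning with sparse scaling factors: an L1 penalty pushes per-channel scaling factors toward zero during training, and channels whose factors fall below a network-wide threshold are then removed. This is an assumption about the general technique, not the optimized algorithm proposed in the article; the function names, layer names, scaling-factor values, and the 0.5 prune ratio are all hypothetical.

```python
# Illustrative sketch only: generic scaling-factor channel pruning,
# not the optimized algorithm proposed in the article.

def l1_subgradient(gammas, lam):
    """Subgradient of the sparsity penalty lam * sum(|gamma|) added to the loss."""
    return [lam * ((g > 0) - (g < 0)) for g in gammas]


def global_threshold(layers, prune_ratio):
    """Return the |gamma| value below which channels are pruned, network-wide."""
    magnitudes = sorted(abs(g) for gammas in layers.values() for g in gammas)
    cut = int(len(magnitudes) * prune_ratio)  # number of channels to drop
    cut = min(cut, len(magnitudes) - 1)       # never drop every channel
    return magnitudes[cut]


def channel_masks(layers, prune_ratio):
    """Build a keep/drop mask per layer, always keeping at least one channel."""
    thr = global_threshold(layers, prune_ratio)
    masks = {}
    for name, gammas in layers.items():
        mask = [abs(g) >= thr for g in gammas]
        if not any(mask):
            # Keep the strongest channel so the layer is not removed entirely.
            best = max(range(len(gammas)), key=lambda i: abs(gammas[i]))
            mask[best] = True
        masks[name] = mask
    return masks


# Hypothetical scaling factors after sparsity training (made-up values).
layers = {
    "conv1": [0.90, 0.02, 0.75, 0.01],
    "conv2": [0.03, 0.04, 0.60, 0.85],
}
masks = channel_masks(layers, prune_ratio=0.5)
kept = {name: sum(mask) for name, mask in masks.items()}
print(kept)  # {'conv1': 2, 'conv2': 2}
```

A single global threshold lets layers with many near-zero factors lose more channels than layers whose factors stay large. In a real detector the masks would then be used to rebuild narrower convolution layers, and the pruned model would typically be fine-tuned.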
Pages: 13