MCS-YOLO: A Multiscale Object Detection Method for Autonomous Driving Road Environment Recognition

Cited by: 27
Authors
Cao, Yining [1 ]
Li, Chao [1 ]
Peng, Yakun [1 ]
Ru, Huiying [2 ]
Affiliations
[1] Hebei Univ Architecture, Coll Informat Engn, Zhangjiakou 075000, Peoples R China
[2] Hebei Univ Architecture, Coll Sci, Zhangjiakou 075000, Peoples R China
Keywords
Task analysis; Object detection; Feature extraction; Autonomous vehicles; Transformers; Road traffic; Detectors; Coordinate attention mechanisms; autonomous driving; road environmental object detection; Swin Transformer; YOLOv5
DOI
10.1109/ACCESS.2023.3252021
CLC number (Chinese Library Classification)
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Object detection and recognition in road scenes are crucial tasks for the environmental perception system of autonomous driving. The low inference speed and accuracy of existing object detection models hinder the development of autonomous driving technology, and improving detection accuracy and speed remains a challenging task. To address these problems, we propose the MCS-YOLO algorithm. First, a coordinate attention module is inserted into the backbone to aggregate the feature map's spatial coordinate and cross-channel information. Second, we design a multiscale small-object detection structure to improve recognition sensitivity for dense small objects. Finally, we apply the Swin Transformer structure to the CNN so that the network can focus on contextual spatial information. In ablation studies on the autonomous driving dataset BDD100K, the MCS-YOLO algorithm achieves a mean average precision of 53.6% and a recall of 48.3%, which are 4.3% and 3.9% better than the YOLOv5s algorithm, respectively. In addition, it achieves a real-time detection speed of 55 frames per second in a real scene. The results show that the MCS-YOLO algorithm is effective and superior for the task of autonomous driving object detection.
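The coordinate attention module named in the abstract factorizes attention into two direction-aware pooling branches so that both cross-channel and positional information are captured. Below is a minimal PyTorch sketch of such a block in the style of Hou et al. (CVPR 2021); the class name, reduction ratio, and Hardswish activation are illustrative assumptions, not details confirmed by this paper.

```python
import torch
import torch.nn as nn


class CoordinateAttention(nn.Module):
    """Sketch of a coordinate attention block: pool along each spatial
    axis separately, encode jointly, then re-weight the input."""

    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool over width  -> (N, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool over height -> (N, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()  # activation choice is an assumption
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        xh = self.pool_h(x)                      # (N, C, H, 1)
        xw = self.pool_w(x).permute(0, 1, 3, 2)  # (N, C, W, 1)
        # Encode both directional descriptors with one shared 1x1 conv.
        y = self.act(self.bn(self.conv1(torch.cat([xh, xw], dim=2))))
        yh, yw = torch.split(y, [h, w], dim=2)
        ah = torch.sigmoid(self.conv_h(yh))                      # (N, C, H, 1)
        aw = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # (N, C, 1, W)
        return x * ah * aw  # broadcast height- and width-wise attention
```

As a quick check, `CoordinateAttention(256)(torch.randn(1, 256, 40, 40))` returns a tensor of the same shape, so a block like this can be dropped between backbone stages without changing feature-map sizes.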
Pages: 22342-22354
Page count: 13