MCS-YOLO: A Multiscale Object Detection Method for Autonomous Driving Road Environment Recognition

Cited by: 27
Authors
Cao, Yining [1 ]
Li, Chao [1 ]
Peng, Yakun [1 ]
Ru, Huiying [2 ]
Affiliations
[1] Hebei Univ Architecture, Coll Informat Engn, Zhangjiakou 075000, Peoples R China
[2] Hebei Univ Architecture, Coll Sci, Zhangjiakou 075000, Peoples R China
Keywords
Task analysis; Object detection; Feature extraction; Autonomous vehicles; Transformers; Road traffic; Detectors; Coordinate attention mechanisms; Autonomous driving; Road environmental object detection; Swin Transformer; YOLOv5
DOI
10.1109/ACCESS.2023.3252021
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Object detection and recognition in road scenes are crucial tasks of the environmental perception system in autonomous driving. Low inference speed and accuracy of object detection models hinder the development of autonomous driving technology, and improving detection accuracy and speed simultaneously remains a challenging task. To address these problems, we propose the MCS-YOLO algorithm. First, a coordinate attention module is inserted into the backbone to aggregate the feature map's spatial coordinate and cross-channel information. Second, we design a multiscale small object detection structure to improve sensitivity to dense small objects. Finally, we apply the Swin Transformer structure to the CNN so that the network can focus on contextual spatial information. In ablation studies on the autonomous driving dataset BDD100K, the MCS-YOLO algorithm achieves a mean average precision of 53.6% and a recall of 48.3%, which are 4.3% and 3.9% higher than those of the YOLOv5s algorithm, respectively. In addition, it reaches a real-time detection speed of 55 frames per second in real scenes. The results show that MCS-YOLO is effective and superior for the autonomous driving object detection task.
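
Of the three modifications described in the abstract, the coordinate attention module is the most self-contained, so a minimal PyTorch sketch of such a block is given below to illustrate how it could be inserted into a YOLOv5-style backbone. The class name CoordinateAttention, the reduction ratio, and the Hardswish activation are assumptions made for this sketch; they are not taken from the paper's released code.

# Minimal sketch of a coordinate attention block (after Hou et al., CVPR 2021),
# the kind of module the abstract describes inserting into the YOLOv5 backbone.
# Names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        # Pool along each spatial axis separately to keep positional information.
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                          # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)      # (B, C, W, 1)
        y = torch.cat([x_h, x_w], dim=2)              # (B, C, H+W, 1)
        y = self.act(self.bn1(self.conv1(y)))         # shared 1x1 transform
        x_h, x_w = torch.split(y, [h, w], dim=2)
        x_w = x_w.permute(0, 1, 3, 2)                 # back to (B, mid, 1, W)
        a_h = torch.sigmoid(self.conv_h(x_h))         # attention along height
        a_w = torch.sigmoid(self.conv_w(x_w))         # attention along width
        return x * a_h * a_w                          # reweight the feature map

# Example: apply the block to a backbone feature map.
if __name__ == "__main__":
    feat = torch.randn(1, 256, 40, 40)
    print(CoordinateAttention(256)(feat).shape)       # torch.Size([1, 256, 40, 40])

In a YOLOv5-style backbone such a block would typically follow a CSP/C3 stage, so that the feature maps passed to the detection head already carry the position-aware channel weighting the abstract refers to.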
Pages: 22342-22354
Number of pages: 13