Road target detection in harsh environments based on improved YOLOv8n

Cited by: 0
Authors
Xu, Minjun [1 ,2 ,3 ]
Sun, Jiayu [1 ,2 ,3 ]
Zhang, Junpeng [1 ,2 ,3 ]
Yan, Mengxue [1 ,2 ,3 ]
Cao, Wen [1 ,2 ,3 ]
Hou, Alin [1 ,2 ,3 ]
Affiliations
[1] Changchun University of Technology, School of Computer Science and Engineering, Changchun
[2] Jilin University, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Changchun
[3] Jilin Province New Generation Artificial Intelligence Smart Health Joint Innovation Laboratory, Changchun
Keywords
non-strided convolution; polarimetric self-attention mechanism; target detection; up-sampling; YOLOv8n
DOI
10.1117/1.JEI.33.5.053022
Abstract
Road conditions in harsh environments such as rain, snow, fog, and night are complex, yet road target detection under these conditions remains understudied: available data are scarce and detection accuracy is low. To address the shortage of road target detection data and the many interference factors in harsh environments, a road target detection method based on an improved you only look once version 8n (YOLOv8n) is proposed. Building on the Berkeley DeepDrive-IW (BDD-IW) dataset, an improved cutout algorithm simulates the target occlusion and uneven image visibility that harsh environments cause in real scenes, enhancing robustness. In YOLOv8n, a space-to-depth convolution (SPD-Conv) module, consisting of a space-to-depth layer and a non-strided convolutional layer, is inserted before each cross-stage partial Darknet53 to two-stage feature pyramid network (C2f) module of the backbone to compensate for the fine-grained image information lost to strided convolution. On this basis, the C2f module and the parallel polarized self-attention (PPSA) mechanism are combined into a new feature extraction module, cross-stage partial DarkNet53 and polarimetric self-attention to two-stage feature pyramid network, which replaces the C2f module in the YOLOv8n backbone to reduce the information loss caused by the heavy noise of harsh environments. To further improve precision, the up-sampling stage of YOLOv8n is replaced with the content-aware reassembly of features (CARAFE) operator, which effectively aggregates context information and enlarges the receptive field. Experimental results show that, on BDD-IW data processed with the improved cutout, the improved YOLOv8n adds only a small number of parameters over the original network while raising precision by 6.0% and mAP50 by 5.0%, effectively improving target detection performance in harsh environments. © 2024 SPIE and IS&T.
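The abstract only names the improved cutout, so the Python sketch below illustrates one plausible reading: a cutout variant that mixes hard occlusion with dimmed patches to mimic uneven visibility. The function name cutout_harsh and all hyperparameters (patch count, sizes, brightness range) are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def cutout_harsh(image, n_patches=3, max_size=40, rng=None):
    """Cutout-style augmentation sketch. Standard cutout zeroes square
    patches; here half of the patches are instead brightness-scaled to
    mimic the uneven visibility described in the abstract. All
    hyperparameters are illustrative assumptions."""
    rng = rng or np.random.default_rng()
    out = image.copy()
    h, w = out.shape[:2]
    for _ in range(n_patches):
        size = int(rng.integers(max_size // 2, max_size + 1))
        cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
        y0, y1 = max(0, cy - size // 2), min(h, cy + size // 2)
        x0, x1 = max(0, cx - size // 2), min(w, cx + size // 2)
        if rng.random() < 0.5:
            out[y0:y1, x0:x1] = 0                        # hard occlusion
        else:
            dim = rng.uniform(0.3, 0.8)                  # uneven visibility
            out[y0:y1, x0:x1] = (out[y0:y1, x0:x1] * dim).astype(out.dtype)
    return out
```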
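The SPD-Conv block itself is small enough to sketch. The following PyTorch snippet is a minimal sketch rather than the authors' code: only the space-to-depth rearrangement followed by a stride-1 (non-strided) convolution comes from the abstract, while the scale factor, 3x3 kernel, and BatchNorm/SiLU choices are assumptions in line with typical YOLOv8 blocks.

```python
import torch
import torch.nn as nn

class SPDConv(nn.Module):
    """Sketch of an SPD-Conv block: a space-to-depth layer followed by a
    non-strided convolution. scale=2 and the 3x3 kernel are assumptions."""

    def __init__(self, in_channels, out_channels, scale=2):
        super().__init__()
        # PixelUnshuffle moves each scale x scale spatial block into the
        # channel dimension, so no pixels are discarded (unlike strided conv).
        self.space_to_depth = nn.PixelUnshuffle(scale)
        self.conv = nn.Conv2d(in_channels * scale * scale, out_channels,
                              kernel_size=3, stride=1, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU()  # YOLOv8 uses SiLU activations

    def forward(self, x):
        # (B, C, H, W) -> (B, C*scale^2, H/scale, W/scale) -> (B, out, H/scale, W/scale)
        return self.act(self.bn(self.conv(self.space_to_depth(x))))

# Usage: halve the spatial resolution while going from 64 to 128 channels.
x = torch.randn(1, 64, 80, 80)
y = SPDConv(64, 128)(x)  # -> torch.Size([1, 128, 40, 40])
```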
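CARAFE can likewise be sketched compactly: one branch predicts a softmax-normalized reassembly kernel for every output pixel, the other gathers k_up x k_up input neighborhoods and combines them with those kernels. This is a generic sketch of the CARAFE operator, not the paper's code; the compressed width c_mid and kernel sizes k_up, k_enc are illustrative defaults.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CARAFE(nn.Module):
    """Minimal sketch of CARAFE upsampling: predict a content-aware
    reassembly kernel per output location, then reassemble the input
    neighborhoods with it. Hyperparameters are illustrative defaults."""

    def __init__(self, channels, scale=2, k_up=5, k_enc=3, c_mid=64):
        super().__init__()
        self.scale, self.k_up = scale, k_up
        self.compress = nn.Conv2d(channels, c_mid, kernel_size=1)
        self.encode = nn.Conv2d(c_mid, scale * scale * k_up * k_up,
                                kernel_size=k_enc, padding=k_enc // 2)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        b, c, h, w = x.shape
        # Kernel prediction: compress channels, encode content, move the
        # scale^2 factor into space, and normalize each k_up^2 kernel.
        k = F.softmax(self.shuffle(self.encode(self.compress(x))), dim=1)
        # Reassembly: extract k_up x k_up neighborhoods of every input pixel,
        # replicate them to the output resolution, and take the weighted sum.
        p = F.unfold(x, self.k_up, padding=self.k_up // 2)   # (B, C*k^2, H*W)
        p = p.view(b, c * self.k_up ** 2, h, w)
        p = F.interpolate(p, scale_factor=self.scale, mode="nearest")
        p = p.view(b, c, self.k_up ** 2, h * self.scale, w * self.scale)
        return (p * k.unsqueeze(1)).sum(dim=2)               # (B, C, sH, sW)

# Usage: upsample an 80x80 feature map to 160x160.
x = torch.randn(1, 128, 80, 80)
y = CARAFE(128)(x)  # -> torch.Size([1, 128, 160, 160])
```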