STrans-YOLOX: Fusing Swin Transformer and YOLOX for Automatic Pavement Crack Detection

Citations: 20
Authors
Luo, Hui [1]
Li, Jiamin [1]
Cai, Lianming [1]
Wu, Mingquan [1]
Affiliations
[1] East China Jiaotong University, School of Information Engineering, Nanchang 330013, People's Republic of China
Source
APPLIED SCIENCES-BASEL | 2023, Vol. 13, No. 3
Funding
National Natural Science Foundation of China
Keywords
pavement crack detection; object detection; Swin Transformer; YOLOX; global guidance attention; multi-scale feature fusion; NMS; complex scenes
DOI
10.3390/app13031999
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Automatic pavement crack detection is crucial for reducing road maintenance costs and ensuring transportation safety. Although convolutional neural networks (CNNs) have been widely used in automatic pavement crack detection, they cannot adequately model the long-range dependencies between pixels and easily lose edge detail information in complex scenes. Moreover, irregular crack shapes also make the detection task challenging. To address these issues, an automatic pavement crack detection architecture named STrans-YOLOX is proposed. Specifically, the architecture first exploits the CNN backbone to extract feature information, preserving the local modeling ability of the CNN. Then, Swin Transformer is introduced to enhance the long-range dependencies through a self-attention mechanism by supplying each pixel with global features. A new global attention guidance module (GAGM) is used to ensure effective information propagation in the feature pyramid network (FPN) by using high-level semantic information to guide the low-level spatial information, thereby enhancing the multi-class and multi-scale features of cracks. During the post-processing stage, we utilize alpha-IoU-NMS to achieve the accurate suppression of the detection boxes in the case of occlusion and overlapping objects by introducing an adjustable power parameter. The experiments demonstrate that the proposed STrans-YOLOX achieves 63.37% mAP and surpasses the state-of-the-art models on the challenging pavement crack dataset.
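
The abstract does not give the exact alpha-IoU-NMS formulation, so the following is only a minimal sketch of one plausible reading: greedy NMS in which the overlap compared against the threshold is IoU raised to an adjustable power alpha. The function names, the (x1, y1, x2, y2) box layout, and the default values for `alpha` and `threshold` are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Plain intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    alpha: float = 2.0,      # hypothetical default for the power parameter
    threshold: float = 0.5,  # hypothetical default suppression threshold
) -> List[int]:
    """Greedy NMS that suppresses a box when IoU**alpha exceeds the threshold.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap too strongly with any kept box.
        if all(iou(boxes[i], boxes[k]) ** alpha <= threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (0, 0, 10, 8), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(alpha_iou_nms(demo_boxes, demo_scores, alpha=1.0))
    print(alpha_iou_nms(demo_boxes, demo_scores, alpha=4.0))
```

In this sketch, alpha = 1 reduces to standard NMS, while a larger alpha shrinks the powered overlap and so lets more heavily overlapping boxes survive. With plain IoU this is equivalent to raising the threshold to threshold^(1/alpha); the paper's variant may differ, for example by using a penalized IoU term.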
Pages: 17