Surface defect detection and classification of steel using an efficient Swin Transformer

Cited by: 71
Authors
Zhu, Wei [1 ]
Zhang, Hui [1 ]
Zhang, Chao [2 ]
Zhu, Xiaoyang [1 ]
Guan, Zhen [1 ]
Jia, Jiale [1 ]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Mech Engn, Zhenjiang 212003, Peoples R China
[2] Tianjin Univ Technol, Sch Comp Sci & Technol, Tianjin, Peoples R China
Keywords
Deep learning; Steel defect detection; Improved Swin Transformer; Transfer learning
DOI
10.1016/j.aei.2023.102061
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Detecting steel-surface defects is a crucial phase in steel manufacturing; however, accurately completing the detection task is challenging. The Swin Transformer, a self-attention-based model, has shown strong performance in computer vision. To enhance the adaptability of the Swin Transformer to steel-surface defect detection, a new network architecture called the LSwin Transformer is proposed in this study. First, for the downsampling process, we propose a convolutional embedding module and an attention patch merging module, which together strengthen the connections between feature map channels, reduce the resolution, and increase image information retention. Second, we propose an effective window shift strategy and a convenient computation approach that give a complete defect split across patches more opportunity for interactive computation. Finally, to combine the feature extraction capability of convolutional neural networks with the global dependency building capability of the Swin Transformer, we propose a depth multilayer perceptron module. Numerous experiments were conducted on a steel-surface defect dataset. The results showed that our model outperformed competing methods in detection, with a mean average precision of 81.2%. In the ablation study, we verified the effectiveness of each module and initialized the model parameters through transfer learning to accelerate convergence. The proposed LSwin Transformer therefore has significant potential for detecting steel-surface defects.
Pages: 13
Related papers
42 in total
[1]  
Ba J. L., 2016, CoRR
[2]  
Bochkovskiy A, 2020, Arxiv, DOI arXiv:2004.10934
[3]   Mixed supervision for surface-defect detection: From weakly to fully supervised learning [J].
Bozic, Jakob ;
Tabernik, Domen ;
Skocaj, Danijel .
COMPUTERS IN INDUSTRY, 2021, 129
[4]   Cascade R-CNN: Delving into High Quality Object Detection [J].
Cai, Zhaowei ;
Vasconcelos, Nuno .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :6154-6162
[5]   End-to-End Object Detection with Transformers [J].
Carion, Nicolas ;
Massa, Francisco ;
Synnaeve, Gabriel ;
Usunier, Nicolas ;
Kirillov, Alexander ;
Zagoruyko, Sergey .
COMPUTER VISION - ECCV 2020, PT I, 2020, 12346 :213-229
[6]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[7]   CenterNet: Keypoint Triplets for Object Detection [J].
Duan, Kaiwen ;
Bai, Song ;
Xie, Lingxi ;
Qi, Honggang ;
Huang, Qingming ;
Tian, Qi .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :6568-6577
[8]  
Howard AG, 2017, Arxiv, DOI arXiv:1704.04861
[9]  
Ge Z, 2021, Arxiv, DOI 10.48550/arXiv.2107.08430
[10]  
Georgieva K., 2015, Computing in Civil Engineering 2015. International Workshop on Computing in Civil Engineering. Proceedings, P99