GDALR: Global Dual Attention and Local Representations in transformer for surface defect detection

Cited by: 10
Authors
Zhou, Xin [1]
Zhou, Shihua [1]
Zhang, Yongchao [1]
Ren, Zhaohui [1]
Jiang, Zeyu [1]
Luo, Hengfa [1]
Affiliations
[1] Northeastern Univ, Sch Mech Engn & Automat, Wenhua Rd, Shenyang 110819, Liaoning, Peoples R China
Keywords
Surface defect detection; Semantic segmentation; Vision transformer; Dual-attention; Local transformer
DOI
10.1016/j.measurement.2024.114398
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Automated surface detection has gradually emerged as a promising and crucial inspection method in the industrial sector, greatly enhancing production quality and efficiency. However, current semantic segmentation models based on Vision Transformers are primarily trained on natural images, which exhibit complex object textures and backgrounds. Additionally, pure Vision Transformers lack the ability to capture local representations, making it challenging to directly apply existing semantic segmentation models to industrial production scenarios. In this paper, we propose a novel transformer segmentation model specifically designed for surface defect detection in industrial settings. First, we employ a Dual-Attention Transformer (DAT) as the backbone of our model. This backbone replaces the generic 2D convolution block with a new self-attention block in the Spatial Reduction Attention (SRA) module, enabling the establishment of a global view for each layer. Second, we enhance the collection of local information during decoding by initializing the relative position between query and key pixels. Finally, to strengthen the salient defect structure, we utilize Pixel Shuffle to rearrange the Ground Truth (GT) in order to guide the feature maps at each scale. Extensive experiments are conducted on three public industrial datasets, and the evaluation results demonstrate the outstanding performance of our network in surface defect detection.
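
The abstract does not spell out how the GT is rearranged, so the following is only a minimal sketch under stated assumptions. It assumes the rearrangement is a space-to-depth operation (the inverse of Pixel Shuffle, often called pixel unshuffle), which turns a full-resolution H x W mask into r*r channels of size (H/r) x (W/r) so that it matches a lower-resolution feature map without discarding any pixels. The function names `pixel_unshuffle` and `pixel_shuffle` and the toy mask `gt` are hypothetical and are not taken from the paper.

```python
def pixel_unshuffle(mask, r):
    """Space-to-depth (assumed rearrangement, not confirmed by the paper).

    Turns an H x W mask into r*r channels of size (H/r) x (W/r).
    Channel dy * r + dx holds the pixel at offset (dy, dx) of every r x r block.
    """
    h, w = len(mask), len(mask[0])
    if h % r or w % r:
        raise ValueError("mask height and width must be divisible by r")
    return [
        [[mask[y * r + dy][x * r + dx] for x in range(w // r)]
         for y in range(h // r)]
        for dy in range(r)
        for dx in range(r)
    ]


def pixel_shuffle(channels, r):
    """Depth-to-space: exact inverse of pixel_unshuffle for one output channel."""
    h, w = len(channels[0]), len(channels[0][0])
    return [
        [channels[(y % r) * r + (x % r)][y // r][x // r] for x in range(w * r)]
        for y in range(h * r)
    ]


# Toy 4 x 4 binary defect mask (hypothetical data).
gt = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

# Four 2 x 2 channels that could supervise a feature map at half resolution.
gt_half = pixel_unshuffle(gt, 2)

# The rearrangement is lossless: shuffling back recovers the original mask.
restored = pixel_shuffle(gt_half, 2)
```

Compared with downsampling the mask by interpolation, this kind of rearrangement keeps every labeled pixel, which may matter for thin or small defects; whether the authors rely on exactly this property is not stated in the abstract.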
Pages: 10