GDALR: Global Dual Attention and Local Representations in transformer for surface defect detection

Cited by: 10
Authors
Zhou, Xin [1]
Zhou, Shihua [1]
Zhang, Yongchao [1]
Ren, Zhaohui [1]
Jiang, Zeyu [1]
Luo, Hengfa [1]
Affiliations
[1] Northeastern Univ, Sch Mech Engn & Automat, Wenhua Rd, Shenyang 110819, Liaoning, Peoples R China
Keywords
Surface defect detection; Semantic segmentation; Vision transformer; Dual-attention; Local transformer
DOI
10.1016/j.measurement.2024.114398
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
Automated surface defect detection has emerged as a promising and crucial inspection method in industry, greatly enhancing production quality and efficiency. However, current semantic segmentation models based on Vision Transformers are primarily trained on natural images, which exhibit complex object textures and backgrounds. Additionally, pure Vision Transformers lack the ability to capture local representations, making it challenging to directly apply existing semantic segmentation models to industrial production scenarios. In this paper, we propose a novel transformer segmentation model specifically designed for surface defect detection in industrial settings. First, we employ a Dual-Attention Transformer (DAT) as the backbone of our model. This backbone replaces the generic 2D convolution block in the Spatial Reduction Attention (SRA) module with a new self-attention block, establishing a global view at each layer. Second, we enhance the collection of local information during decoding by initializing the relative position between query and key pixels. Finally, to strengthen the salient defect structure, we use Pixel Shuffle to rearrange the Ground Truth (GT) so that it can guide the feature maps at each scale. Extensive experiments on three public industrial datasets demonstrate the outstanding performance of our network in surface defect detection.
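The abstract does not specify how the GT is rearranged, so the following is only a minimal sketch of one plausible reading: a space-to-depth (pixel-unshuffle) rearrangement that turns a full-resolution mask into r*r lower-resolution planes, so it can be compared with a feature map downsampled by a factor r without discarding any label. The function names, the channel ordering, and the toy mask are assumptions for illustration, not the authors' implementation.

```python
def pixel_unshuffle(mask, r):
    """Rearrange an H x W mask into r*r planes of size (H/r) x (W/r).

    Plane index c = dy * r + dx holds the pixels mask[y*r + dy][x*r + dx],
    so every label is kept, unlike nearest-neighbour downsampling.
    (Assumed channel ordering, chosen for illustration.)
    """
    h, w = len(mask), len(mask[0])
    if h % r or w % r:
        raise ValueError("mask height and width must be divisible by r")
    return [
        [[mask[y * r + dy][x * r + dx] for x in range(w // r)]
         for y in range(h // r)]
        for dy in range(r)
        for dx in range(r)
    ]


def pixel_shuffle(planes, r):
    """Inverse of pixel_unshuffle: merge r*r planes back into one mask."""
    hs, ws = len(planes[0]), len(planes[0][0])
    out = [[0] * (ws * r) for _ in range(hs * r)]
    for c, plane in enumerate(planes):
        dy, dx = divmod(c, r)
        for y in range(hs):
            for x in range(ws):
                out[y * r + dy][x * r + dx] = plane[y][x]
    return out


# Hypothetical 4x4 ground-truth mask with a 2x2 defect in the lower-right corner.
gt = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
planes = pixel_unshuffle(gt, 2)      # 4 planes, each 2x2
restored = pixel_shuffle(planes, 2)  # round trip recovers the original mask
```

Because the rearrangement is a permutation of pixels, the round trip is lossless, which is the property that makes it attractive for supervising several scales from a single mask.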
Pages: 10