A novel hybrid attention gate based on vision transformer for the detection of surface defects

Cited by: 2

Authors:

Uzen, Hueseyin ^{[1]}

Turkoglu, Muammer ^{[2]}

Ozturk, Dursun ^{[3]}

Hanbay, Davut ^{[4]}

Affiliations:

[1] Bingol Univ, Dept Comp Engn, TR-12000 Bingol, Turkiye

[2] Samsun Univ, Dept Software Engn, TR-55000 Samsun, Turkiye

[3] Bingol Univ, Dept Elect & Elect Engn, TR-12000 Bingol, Turkiye

[4] Inonu Univ, Dept Comp Engn, TR-44000 Malatya, Turkiye

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2024 / Vol. 18 / Issue 10

Keywords:

Defects detection; Vision transformers; Squeeze and excitation; Encoder decoder network; Convolutional neural network; CONVOLUTIONAL NEURAL-NETWORK;

DOI:

10.1007/s11760-024-03355-2

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808; 0809

Abstract:

Many advanced models have been proposed for automatic surface defect inspection. Although CNN-based methods have achieved superior performance among these models, they are limited in extracting global semantic details because of the locality of the convolution operation, even though global semantic details contribute strongly to detecting surface defects. Recently, inspired by the success of the Transformer, which models global semantic details effectively through global self-attention, some researchers have started to apply Transformer-based methods to many computer-vision challenges. However, as many researchers have noted, transformers lose spatial details while extracting semantic features. To alleviate these problems, this paper proposes a transformer-based Hybrid Attention Gate (HAG) model that extracts both global semantic features and spatial features. The HAG model consists of a Transformer (Trans), a channel Squeeze-spatial Excitation (sSE) module, and a merge process. The Trans model extracts global semantic features, and the sSE extracts spatial features. The merge process, which comes in different versions such as concat, add, max, and mul, allows these two models to be combined effectively. Finally, four versions of the HAG-Feature Fusion Network (HAG-FFN) were developed using the proposed HAG model for the detection of surface defects. Four different datasets were used to test the performance of the proposed HAG-FFN versions. In the experimental studies, the proposed model achieved mIoU scores of 83.83%, 79.34%, 76.53%, and 81.78% on the MT, MVTec-Texture, DAGM, and AITEX datasets, respectively. These results show that the proposed HAGmax-FFN model outperformed the state-of-the-art models.
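The merge step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `sse_gate` and `merge`, the (C, H, W) layout, and the random stand-in for the Transformer branch output are all assumptions made for illustration. The sSE gate is written in its commonly used form (a 1x1 convolution that squeezes the channels, followed by a sigmoid spatial mask); how the paper wires it into HAG may differ.

```python
import numpy as np


def sse_gate(x, w, b=0.0):
    """Channel squeeze + spatial excitation on a (C, H, W) feature map.

    Assumed form: a 1x1 convolution (a weighted sum over channels with
    weights `w`) squeezes the channels into one (H, W) map, and a sigmoid
    turns it into a spatial mask that rescales every channel of `x`.
    """
    squeezed = np.tensordot(w, x, axes=([0], [0])) + b  # shape (H, W)
    mask = 1.0 / (1.0 + np.exp(-squeezed))              # sigmoid
    return x * mask[None, :, :]                         # broadcast over channels


def merge(trans_feat, sse_feat, mode="max"):
    """Combine two same-shaped (C, H, W) feature maps.

    The four modes mirror the names listed in the abstract.
    """
    if trans_feat.shape != sse_feat.shape:
        raise ValueError("both branches must produce the same shape")
    if mode == "concat":
        return np.concatenate([trans_feat, sse_feat], axis=0)  # (2C, H, W)
    if mode == "add":
        return trans_feat + sse_feat
    if mode == "max":
        return np.maximum(trans_feat, sse_feat)
    if mode == "mul":
        return trans_feat * sse_feat
    raise ValueError(f"unknown merge mode: {mode!r}")


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))           # hypothetical encoder features
trans_feat = rng.standard_normal((4, 8, 8))  # stand-in for the Trans branch
w = rng.standard_normal(4)                   # hypothetical 1x1 conv weights
sse_feat = sse_gate(x, w)

for mode in ("concat", "add", "max", "mul"):
    print(mode, merge(trans_feat, sse_feat, mode).shape)
```

Only `concat` changes the channel count (it doubles it), so a layer that follows a concat merge needs twice as many input channels as one that follows add, max, or mul.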


Pages: 6835-6851

Page count: 17
