CBAM-SwinT-BL: Small Rail Surface Defect Detection Method Based on Swin Transformer With Block Level CBAM Enhancement

Citations: 0
Authors
Zhao, Jiayi [1 ]
Yeung, Alison Wun-Lam [1 ]
Ali, Muhammad [1 ]
Lai, Songjiang [1 ]
Ng, Vincent To-Yee [1 ]
Affiliations
[1] Ctr Adv Reliabil & Safety CAiRS, Hong Kong, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Transformers; Rails; Feature extraction; Computer architecture; Defect detection; Attention mechanisms; YOLO; Mathematical models; Corrugated surfaces; Accuracy; Object detection; small rail surface defects; Swin Transformer; convolutional block attention module (CBAM); attention mechanism;
DOI
10.1109/ACCESS.2024.3509986
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Under high-intensity rail operations, rail tracks endure considerable stresses, resulting in various defects such as corrugation and spalling. Failure to detect defects effectively and provide timely maintenance would compromise service reliability and public safety. While advanced models have been developed in recent years, the efficient identification of small-scale rail defects has not yet been studied, especially for categories such as Dirt or Squat on the rail surface. To address this challenge, this study uses the Swin Transformer (SwinT) as the baseline and incorporates the Convolutional Block Attention Module (CBAM) for enhancement. The proposed method integrates CBAM successively within the Swin Transformer blocks, resulting in significant performance improvement in rail defect detection, particularly for categories with small instance sizes. The proposed framework is named CBAM-Enhanced Swin Transformer in Block Level (CBAM-SwinT-BL). Experiments and an ablation study demonstrate the effectiveness of the framework. It notably improves accuracy on small defects: for the dirt and dent categories in the RIII dataset, mAP-50 increases by 23.0% and 38.3% respectively, and for the squat category in the MUET dataset it is 13.2% higher than the original model. Compared to the original SwinT, CBAM-SwinT-BL increases overall precision by around 5% on the MUET dataset and 7% on the RIII dataset, reaching 69.1% and 88.1% respectively. Meanwhile, the additional CBAM module lengthens training time by only 0.04 s/iteration on average, which is acceptable given the significant improvement in system performance.
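To make the CBAM step named in the abstract concrete, the following is a minimal pure-Python sketch of a generic CBAM pass (channel attention followed by spatial attention) over a small C x H x W feature map. It is not the authors' implementation, and it does not show how CBAM is placed inside the Swin Transformer blocks, which the abstract does not specify. The function names, the random placeholder weights, the reduction to 2 hidden units, and the 3x3 spatial kernel (the original CBAM design uses 7x7) are all assumptions chosen to keep the example small.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap, w1, w2):
    """fmap: C x H x W nested lists. w1 (hidden x C) and w2 (C x hidden)
    form an MLP shared by the average-pooled and max-pooled descriptors."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(map(max, ch)) for ch in fmap]

    def mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    # One weight in (0, 1) per channel.
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]

def spatial_attention(fmap, kernel):
    """kernel: 2 x k x k weights applied to the [avg, max] channel-pooled maps."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    k = len(kernel[0])
    pad = k // 2
    avg = [[sum(fmap[ch][i][j] for ch in range(c)) / c for j in range(w)] for i in range(h)]
    mx = [[max(fmap[ch][i][j] for ch in range(c)) for j in range(w)] for i in range(h)]
    pooled = [avg, mx]
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            s = 0.0
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:  # zero padding at the borders
                            s += kernel[p][di][dj] * pooled[p][y][x]
            row.append(sigmoid(s))  # one weight in (0, 1) per spatial location
        out.append(row)
    return out

def cbam(fmap, w1, w2, kernel):
    """Apply channel attention first, then spatial attention on the result."""
    ca = channel_attention(fmap, w1, w2)
    refined = [[[v * ca[ch] for v in row] for row in fmap[ch]] for ch in range(len(fmap))]
    sa = spatial_attention(refined, kernel)
    return [[[refined[ch][i][j] * sa[i][j] for j in range(len(sa[0]))]
             for i in range(len(sa))] for ch in range(len(refined))]

# Placeholder sizes and random weights (assumptions, not trained values).
random.seed(0)
C, H, W, HIDDEN, K = 4, 5, 5, 2, 3
fmap = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(HIDDEN)]
w2 = [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(C)]
kernel = [[[random.uniform(-1, 1) for _ in range(K)] for _ in range(K)] for _ in range(2)]

out = cbam(fmap, w1, w2, kernel)
print(len(out), len(out[0]), len(out[0][0]))  # shape is preserved: 4 5 5
```

Because both attention maps pass through a sigmoid, the module only rescales activations and leaves the feature-map shape unchanged, which is why a block of this kind can be dropped between existing layers of a backbone without altering the tensor sizes downstream.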
Pages: 181997-182009
Page count: 13