Depth-Assisted Semi-Supervised RGB-D Rail Surface Defect Inspection

Cited by: 3
Authors
Wang, Jie [1]
Li, Guoqiang [2]
Qiu, Guanwen [3]
Ma, Gang [4]
Xi, Jinwen [5]
Yu, Nana [1]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300354, Peoples R China
[2] Chinese Acad Sci, Natl Sci Lib, Beijing 100045, Peoples R China
[3] Univ Penn, Sch Engn & Appl Sci, Philadelphia, PA 19104 USA
[4] BOE Technol Grp Co Ltd, Beijing 100176, Peoples R China
[5] Zhongguancun Lab, Beijing 100094, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Rails; Annotations; Inspection; Training; Surface morphology; Shape; Rail surface defect inspection; RGB-D images; Semi-supervised learning; Salient object detection; Network
DOI
10.1109/TITS.2024.3387949
Chinese Library Classification (CLC) number
TU [Building Science]
Subject classification code
0813
Abstract
Vision-based methods for rail surface defect inspection (RSDI) effectively overcome the limitations of manual inspection, as they can intuitively display the locations and segmented areas of sensitive defects. The RGB-D RSDI task, which leverages the complementarity between RGB and depth (D) image information to enhance detection performance, has attracted widespread attention and achieved significant development. However, existing methods primarily depend on fully supervised training strategies that require a substantial number of manually annotated pixel-level labels. Such extensive manual annotation is time-consuming and labor-intensive, and the irregular shapes and textures of rail surface defects further compound the labeling burden. Therefore, in this paper, we introduce the semi-supervised learning paradigm into this task. For semi-supervised RGB-D RSDI, a task-specific semi-supervised network and an effective cross-modal fusion module are crucial to ensuring detection performance when labeled samples are limited. Thus, we propose a Depth-assisted Semi-Supervised RGB-D RSDI network (DSSNet) to simultaneously alleviate the annotation burden and achieve satisfactory detection performance. Specifically, following the consistency training paradigm, we construct a semi-supervised RGB-D RSDI architecture by optimizing the network structure, perturbation mechanisms, and loss settings, among other components. Furthermore, we propose a Depth-assisted Multi-scale Cross-modal Fusion Module (DMCFM) that conducts multi-scale exploration and cross-modal complementary fusion with the assistance of depth. Comprehensive experiments demonstrate that, compared to 14 recent state-of-the-art fully supervised methods, the proposed DSSNet achieves highly competitive results while reducing the annotation burden by 80%.
Pages: 8042-8052
Number of pages: 11