Depth-Assisted Semi-Supervised RGB-D Rail Surface Defect Inspection

Cited by: 7
Authors
Wang, Jie [1 ]
Li, Guoqiang [2 ]
Qiu, Guanwen [3 ]
Ma, Gang [4 ]
Xi, Jinwen [5 ]
Yu, Nana [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300354, Peoples R China
[2] Chinese Acad Sci, Natl Sci Lib, Beijing 100045, Peoples R China
[3] Univ Penn, Sch Engn & Appl Sci, Philadelphia, PA 19104 USA
[4] BOE Technol Grp Co Ltd, Beijing 100176, Peoples R China
[5] Zhongguancun Lab, Beijing 100094, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Rails; Annotations; Inspection; Training; Surface morphology; Shape; Rail surface defect inspection; RGB-D images; semi-supervised learning; salient object detection; SALIENT OBJECT DETECTION; NETWORK;
DOI
10.1109/TITS.2024.3387949
Chinese Library Classification
TU [Building Science];
Discipline Code
0813;
Abstract
Visual-based methods for rail surface defect inspection (RSDI) effectively overcome the limitations of manual inspection, as they can intuitively display the locations and segmented regions of sensitive defects. The RGB-D RSDI task, which leverages the complementarity between RGB and depth (D) information to enhance detection performance, has attracted widespread attention and achieved significant progress. However, existing methods primarily rely on fully supervised training strategies that require a substantial number of manually annotated pixel-level labels to supervise model training. Such extensive manual annotation is exceedingly time-consuming and labor-intensive, particularly given the irregular shapes and textures of rail surface defects, which further compound the labeling burden. Therefore, in this paper, we introduce the semi-supervised learning paradigm into this task. For semi-supervised RGB-D RSDI, a task-specific semi-supervised network and an effective cross-modal fusion module are crucial to ensuring detection performance under the constraint of limited labeled samples. We thus propose a Depth-assisted Semi-Supervised RGB-D RSDI network (DSSNet) that simultaneously alleviates the annotation burden and achieves satisfactory detection performance. Specifically, following the consistency-training paradigm, we construct a semi-supervised RGB-D RSDI architecture by optimizing its structure, perturbation mechanisms, and loss settings. Furthermore, we propose a Depth-assisted Multi-scale Cross-modal Fusion Module (DMCFM) that performs multi-scale exploration and cross-modal complementary fusion with the assistance of depth. Comprehensive experiments demonstrate that, compared with 14 recent state-of-the-art fully supervised methods, the proposed DSSNet achieves highly competitive results while reducing the annotation burden by 80%.
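The abstract describes a consistency-training objective: a supervised term on the few labeled images plus an agreement term between predictions on clean and perturbed views of unlabeled images. As a rough, generic sketch only (the function name, the `lambda_u` weight, and the choice of MSE for the consistency term are illustrative assumptions, not details from the paper):

```python
import numpy as np

def consistency_semi_supervised_loss(pred_labeled, labels,
                                     pred_unlabeled, pred_perturbed,
                                     lambda_u=1.0):
    """Generic consistency-training loss for semi-supervised segmentation.

    pred_labeled   : (N, H, W) predicted defect probabilities, labeled images
    labels         : (N, H, W) binary ground-truth defect masks
    pred_unlabeled : (M, H, W) predictions on clean unlabeled images
    pred_perturbed : (M, H, W) predictions on perturbed views of the same images
    """
    eps = 1e-7
    p = np.clip(pred_labeled, eps, 1 - eps)
    # Supervised term: pixel-wise binary cross-entropy on labeled data.
    sup = -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    # Unsupervised term: predictions under perturbation should agree
    # with predictions on the clean views (mean squared error).
    cons = np.mean((pred_unlabeled - pred_perturbed) ** 2)
    return sup + lambda_u * cons
```

When the clean and perturbed predictions agree exactly, the consistency term vanishes and only the supervised term remains, so unlabeled images contribute gradient only where the model is unstable under perturbation.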
Pages: 8042-8052
Page count: 11