Hierarchical cross-modal multi-anchor distillation for rail surface defect detection

Cited by: 1
Authors
Wang, Bingying [1]
Qiang, Fangfang [1]
Zhou, Wujie [1, 2]
Affiliations
[1] Zhejiang University of Science and Technology, School of Information and Electronic Engineering, Hangzhou 310023, People's Republic of China
[2] Nanyang Technological University, School of Computer Science and Engineering, Singapore 308232, Singapore
Funding
National Natural Science Foundation of China
Keywords
Transformer; Graph convolution network; Knowledge distillation; Cross-modal and cross-stage fusion; Rail defect inspection; Salient object detection
DOI
10.1016/j.measurement.2025.117600
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
For RGB-D (red-green-blue plus depth) rail surface defect detection (RSDD), most models focus on improving accuracy while ignoring computational cost. To address this gap, we introduce a hierarchical cross-modal multi-anchor distillation network (HCMD) for RSDD. We also propose a defect detection approach comprising three key stages: merge, split, and recombine. During feature extraction, a cross-modal cross-stage fusion module divides the multi-layer features into high-level and low-level guidance features. The learned semantic features then guide low-level feature fusion through a multiscale graph convolution. Finally, to reduce computational effort, a hierarchical cross-modal multi-anchor distillation loss transfers the teacher's knowledge to the student, thereby compressing the model. Extensive experiments on the NEU RSDDS-AUG benchmark RGB-D dataset demonstrate that HCMD achieves high accuracy at reduced computational complexity.
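The abstract does not define the hierarchical cross-modal multi-anchor distillation loss, so the sketch below shows only the generic idea it builds on: teacher-to-student knowledge transfer with temperature-softened outputs, in the style of Hinton et al. (2015). It is not the paper's loss. The function names, the temperature value, and the example logits are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic soft-target distillation (an assumption for illustration),
    not the multi-anchor loss described in the abstract.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a single pixel or sample with three classes.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]

loss = distillation_loss(teacher, student)
identical = distillation_loss(teacher, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what lets a smaller student be trained toward a larger teacher's behavior.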
Pages: 14