Hierarchical Cross-Modal multianchor distillation for rail surface defect detection

Cited by: 1
Authors
Wang, Bingying [1]
Qiang, Fangfang [1]
Zhou, Wujie [1, 2]
Affiliations
[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 308232, Singapore
Funding
National Natural Science Foundation of China;
Keywords
Transformer; Graph convolution network; Knowledge distillation; Cross-modal and cross-stage fusion; Rail defect inspection; Salient object detection;
DOI
10.1016/j.measurement.2025.117600
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08;
Abstract
For red-green-blue with depth (RGB-D) rail surface defect detection (RSDD), most models focus on improving accuracy while ignoring computational cost. To address this gap, we introduce a hierarchical cross-modal multi-anchor distillation network (HCMD) for RSDD. We also propose a defect detection approach comprising three key stages: merge, split, and recombine. During feature extraction, multi-layer features are divided into high- and low-level guidance features using a cross-modal cross-stage fusion module. Subsequently, the learned semantic features guide low-level feature fusion through a multiscale graph convolution. Finally, to reduce computational effort, a hierarchical cross-modal multi-anchor distillation loss transfers the teacher's knowledge to the student, thus achieving model compression. Extensive experiments on the NEU RSDDS-AUG benchmark RGB-D dataset demonstrate that HCMD achieves high accuracy at reduced computational complexity.
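The abstract does not give the form of the hierarchical cross-modal multi-anchor distillation loss, so the following is only a minimal sketch of the general teacher-to-student idea it builds on: a generic temperature-softened KL divergence between teacher and student outputs. The function names, the temperature value, and the two-class logits are all assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Generic KL(teacher || student) on softened distributions, scaled by T^2.

    This is a standard logit-distillation loss, NOT the HCMD loss itself.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits for a single pixel (defect vs. background).
teacher = [2.0, -1.0]
student = [0.5, 0.2]

loss_mismatch = distillation_loss(teacher, student)  # positive: outputs differ
loss_match = distillation_loss(teacher, teacher)     # zero: identical outputs
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge, which is the signal a smaller student network is trained to minimize.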
Pages: 14