Hierarchical cross-modal multianchor distillation for rail surface defect detection

Cited by: 1

Authors:

Wang, Bingying [1]

Qiang, Fangfang [1]

Zhou, Wujie [1,2]

Affiliations:

[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China

[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 308232, Singapore

Source:

MEASUREMENT | 2025 / Vol. 253

Funding:

National Natural Science Foundation of China;

Keywords:

Transformer; Graph convolution network; Knowledge distillation; Cross-modal and cross-stage fusion; Rail defect inspection; SALIENT OBJECT DETECTION;

DOI:

10.1016/j.measurement.2025.117600

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

For red-green-blue with depth (RGB-D) rail surface defect detection (RSDD), most models focus on improving accuracy while ignoring computational cost. To address this gap, we introduce a hierarchical cross-modal multi-anchor distillation network (HCMD) for RSDD. Moreover, we propose an innovative defect detection approach comprising three key stages: merge, split, and recombine. During feature extraction, multi-layer features are divided into high- and low-level guidance features using a cross-modal cross-stage fusion module. Subsequently, the learned semantic feature guides the low-level feature fusion through a multiscale graph convolution. Finally, to reduce the computational effort, a hierarchical cross-modal multi-anchor distillation loss is used to transfer the teacher's knowledge to the student, thus achieving model compression. Extensive experiments on the NEU RSDDS-AUG benchmark RGB-D dataset demonstrate HCMD's superiority in terms of reduced computational complexity and high accuracy.

Pages: 14