Hierarchical Cross-Modal Multianchor Distillation for Rail Surface Defect Detection

Cited by: 1

Authors:

Wang, Bingying [1]

Qiang, Fangfang [1]

Zhou, Wujie [1,2]

Affiliations:

[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China

[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 308232, Singapore

Source:

MEASUREMENT | 2025 / Vol. 253

Funding:

National Natural Science Foundation of China;

Keywords:

Transformer; Graph convolution network; Knowledge distillation; Cross-modal and cross-stage fusion; Rail defect inspection; SALIENT OBJECT DETECTION;

DOI:

10.1016/j.measurement.2025.117600

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

For red-green-blue with depth (RGB-D) rail surface defect detection (RSDD), most models focus on improving accuracy while ignoring computational cost. To address this gap, we introduce a hierarchical cross-modal multi-anchor distillation network (HCMD) for RSDD. Moreover, we propose an innovative defect detection approach comprising three key stages: merge, split, and recombine. During feature extraction, multi-layer features are divided into high- and low-level guidance features using a cross-modal cross-stage fusion module. Subsequently, the learned semantic feature guides the low-level feature fusion through a multiscale graph convolution. Finally, to reduce the computational effort, a hierarchical cross-modal multi-anchor distillation loss is used to transfer the teacher's knowledge to the student, thus achieving model compression. Extensive experiments on the NEU RSDDS-AUG benchmark RGB-D dataset demonstrate HCMD's superiority in terms of reduced computational complexity and high accuracy.
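The teacher-to-student transfer mentioned in the abstract can be illustrated with a generic, logit-based distillation loss in the style of Hinton et al. (reference [9]). This is only a minimal sketch of the general idea, not the paper's hierarchical cross-modal multi-anchor loss, whose formulation is not given in this record. The function names, the temperature value, and the example logits are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: a generic soft-target loss, not the HCMD loss.
    """
    p = softmax(teacher_logits, temperature)  # softened teacher targets
    q = softmax(student_logits, temperature)  # softened student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]  # made-up logits for illustration
    student = [3.0, 2.0, 1.0]
    print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge, so minimizing it pulls a small student toward the behavior of a larger teacher.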

References

Pages: 14

43 entries in total

[1] Disentangled Cross-Modal Transformer for RGB-D Salient Object Detection and Beyond [J].

Chen, Hao; Shen, Feihong; Ding, Ding; Deng, Yongjian; Li, Chao.

IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33: 1699-1709

[2] Chen Q, 2021, AAAI CONF ARTIF INTE, V35, P1063

[3] CUFuse: Camera and Ultrasound Data Fusion for Rail Defect Detection [J].

Chen, Zhengxing; Wang, Qihang; He, Qing; Yu, Tianle; Zhang, Min; Wang, Ping.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (11): 21971-21983

[4] CIR-Net: Cross-Modality Interaction and Refinement for RGB-D Salient Object Detection [J].

Cong, Runmin; Lin, Qinwei; Zhang, Chen; Li, Chongyi; Cao, Xiaochun; Huang, Qingming; Zhao, Yao.

IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 6800-6815

[5] BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network [J].

Fan, Deng-Ping; Zhai, Yingjie; Borji, Ali; Yang, Jufeng; Shao, Ling.

COMPUTER VISION - ECCV 2020, PT XII, 2020, 12357: 275-292

[6] Local Background Enclosure for RGB-D Salient Object Detection [J].

Feng, David; Barnes, Nick; You, Shaodi; McCarthy, Chris.

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 2343-2350

[7] Double Similarity Distillation for Semantic Image Segmentation [J].

Feng, Yingchao; Sun, Xian; Diao, Wenhui; Li, Jihao; Gao, Xin.

IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30: 5363-5376

[8] Hierarchical Multi-Attention Transfer for Knowledge Distillation [J].

Gou, Jianping; Sun, Liyuan; Yu, Baosheng; Wan, Shaohua; Tao, Dacheng.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (02)

[9] Hinton G, 2015, arXiv:1503.02531, DOI 10.48550/arXiv.1503.02531

[10] Cross-Modal Fusion and Progressive Decoding Network for RGB-D Salient Object Detection [J].

Hu, Xihang; Sun, Fuming; Sun, Jing; Wang, Fasheng; Li, Haojie.

INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (08): 3067-3085
