Multilevel attention imitation knowledge distillation for RGB-thermal transmission line detection

Times cited: 4
Authors
Guo, Xiaodong [1 ]
Zhou, Wujie [2 ,3 ]
Liu, Tong [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Beijing 100081, Peoples R China
[2] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 308232, Singapore
Funding
National Natural Science Foundation of China;
关键词
Transmission line detection; Convolutional neural networks; Multi-modal; Knowledge distillation; SALIENT OBJECT DETECTION; NETWORK; FUSION;
DOI
10.1016/j.eswa.2024.125406
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transmission line detection (TLD) plays a crucial role in ensuring the safety and stability of electricity supply. Applying RGB-thermal convolutional neural networks (CNNs) to unmanned aerial vehicle (UAV) photography is a valuable alternative for diagnosing transmission line faults. However, existing CNNs struggle to generalize to TLD because of cluttered backgrounds and variable weather conditions. In addition, the limited computational resources and storage space of UAVs pose challenges for the lightweight design of models. In the present study, we developed a novel multilevel attention imitation knowledge distillation structure comprising a high-performing teacher model, MAINet-T, and a compact student model, MAINet-S. We aimed to 1) improve the accuracy and robustness of TLD and 2) optimize the performance and capacity of the model for deployment on UAVs. MAINet-T has a three-stage feature aggregation module and a detail enhancement module to facilitate multi-modal and multilevel feature complementation and interaction. To balance model performance and capacity for deployment, we proposed a novel knowledge distillation (KD) strategy, comprising response distillation and feature distillation, to obtain an optimized model, MAINet-S*. Within feature distillation, we proposed a multilevel attention imitation module to integrate the advantages of the attention maps from different stages of the encoder. In experiments on the VITLD dataset, MAINet-S* outperformed 15 state-of-the-art methods, with a 66.2% reduction in the number of weight parameters (Params) and a 69.9% reduction in floating-point operations (FLOPs) compared with MAINet-T.
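The two KD components named in the abstract follow well-known generic recipes: response distillation softens teacher and student logits with a temperature and matches the resulting distributions, while attention-based feature distillation matches spatial attention maps derived from intermediate encoder features. The sketch below illustrates these standard formulations in numpy only; it is not the paper's exact method, and the function names, the temperature value, and the per-stage weights are illustrative assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax along the last axis."""
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def response_distillation(student_logits, teacher_logits, T=4.0):
    """Hinton-style response distillation: KL(teacher || student) on
    temperature-softened outputs, scaled by T^2 to keep gradient magnitudes."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)) * T * T)

def attention_map(feat):
    """Spatial attention map of a (C, H, W) feature: channel-wise mean of
    squared activations, L2-normalized so scale differences cancel out."""
    a = np.mean(feat ** 2, axis=0)                 # (H, W)
    return a / (np.linalg.norm(a) + 1e-8)

def attention_imitation(student_feats, teacher_feats, weights):
    """Feature distillation: weighted MSE between student and teacher
    attention maps at each encoder stage (multilevel imitation)."""
    loss = 0.0
    for w, fs, ft in zip(weights, student_feats, teacher_feats):
        loss += w * float(np.mean((attention_map(fs) - attention_map(ft)) ** 2))
    return loss
```

In this generic setup the student's total loss would be its task loss plus weighted sums of the two terms above; both terms vanish when the student exactly reproduces the teacher's logits and attention maps.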
Pages: 13