CLMFNet: Cross-Level Multimodal Fusion Network for RGB-T Semantic Segmentation of Distribution Network Lines

Cited by: 0

Authors:

Du, Rui [1]

Zhang, Hui [1]

Zhong, Hang

Huang, Zhihong [2,3]

Wang, Yaonan [4]

Affiliations:

[1] Hunan Univ, Sch Robot, Changsha 410012, Peoples R China

[2] State Grid Hunan Elect Power Co, Elect Power Res Inst, Changsha 410017, Peoples R China

[3] Hunan Prov Xiangdian Elect Power Expt Res Inst, Changsha 410017, Peoples R China

[4] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China

Source:

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS | 2025

Funding:

National Natural Science Foundation of China

Keywords:

Feature extraction; Semantic segmentation; Power transmission lines; Accuracy; Substations; Thermal sensors; Transformers; Semantics; Three-dimensional displays; Monitoring; Distribution network lines; Multimodal fusion; RGB-thermal fusion

DOI:

10.1109/TII.2025.3578136

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Accurate semantic segmentation is crucial in distribution network line monitoring to ensure the system's reliability and security. Due to the complexity of the environment and the diversity of devices, unimodal images (such as RGB images) struggle to provide enough information for effective segmentation. To address these challenges, the complementary nature of RGB and thermal infrared (TIR) images is leveraged to significantly enhance segmentation performance. Therefore, an innovative cross-level multimodal fusion network (CLMFNet) is proposed to improve the accuracy and robustness of semantic segmentation by integrating RGB and TIR data. A dual-branch architecture is employed in CLMFNet to extract features from both RGB and TIR images, which are then effectively integrated through a multimodal fusion strategy. Additionally, a cross-layer guidance mechanism is introduced to facilitate the complementation and optimization of features across different levels. CLMFNet was validated on a custom RGB-T dataset, and experimental results showed that it outperformed state-of-the-art methods in key metrics such as mean accuracy (mAcc) and mean intersection over union (mIoU), demonstrating its effectiveness in performing semantic segmentation in complex power distribution scenarios.
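The two metrics named in the abstract, mean accuracy (mAcc) and mean intersection over union (mIoU), are commonly computed from a per-class confusion matrix. The sketch below shows that standard definition in plain Python. It is an illustration only: the function names `confusion_matrix` and `mean_acc_and_miou` are hypothetical, and the record does not state the paper's exact evaluation protocol (for example, whether any classes are ignored), so this may differ from the authors' procedure.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: cm[t][p] is the number of pixels with label t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_acc_and_miou(cm):
    """Return (mAcc, mIoU) averaged over classes that actually occur.

    Assumption: classes with no ground-truth pixels are skipped for mAcc,
    and classes with an empty union are skipped for mIoU.
    """
    n = len(cm)
    accs, ious = [], []
    for c in range(n):
        tp = cm[c][c]                             # correctly labelled pixels of class c
        gt = sum(cm[c])                           # pixels whose ground truth is c
        pred = sum(cm[r][c] for r in range(n))    # pixels predicted as c
        union = gt + pred - tp
        if gt > 0:
            accs.append(tp / gt)
        if union > 0:
            ious.append(tp / union)
    macc = sum(accs) / len(accs) if accs else 0.0
    miou = sum(ious) / len(ious) if ious else 0.0
    return macc, miou


# Toy example with three classes and six flattened "pixels".
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
macc, miou = mean_acc_and_miou(confusion_matrix(y_true, y_pred, 3))
print(f"mAcc={macc:.3f}, mIoU={miou:.3f}")
```

In practice the label and prediction maps of every test image would be flattened and accumulated into a single confusion matrix before the averages are taken.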


Pages: 11
