Enhanced RGB-T saliency detection via thermal-guided multi-stage attention network

Cited by: 1

Authors:

Pang, Yu [1]

Huang, Yang [1]

Weng, Chenyu [1]

Lyu, Jialin [1]

Bai, Chuanyue [1]

Yu, Xiaosheng [2]

Affiliations:

[1] Shenyang University of Technology, School of Artificial Intelligence, Shenyang 110870, Liaoning, People's Republic of China

[2] Northeastern University, Faculty of Robot Science and Engineering, Shenyang 110169, Liaoning, People's Republic of China

Source:

VISUAL COMPUTER | 2025

Funding:

National Natural Science Foundation of China;

Keywords:

RGB-T saliency detection; Single-stream network; Multi-stage framework; Modality-interaction; Attention mechanism; FUSION;

DOI:

10.1007/s00371-025-03855-3

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Single-stream structures are prevalent in RGB-T saliency detection due to their efficiency and lightweight nature. However, existing multi-modal single-stream methods suffer from limited detection performance, primarily due to inadequate exploitation of the thermal modality's strengths. To address this, we propose a novel single-stream network called Thermal-induced Modality-interaction Multi-stage Attention Network (TMMANet). Our approach leverages thermal-induced attention mechanisms in both the encoder and decoder stages to effectively integrate RGB and thermal modalities. In the encoder, a Thermal-induced Modality-interaction Self-Attention mechanism is introduced to extract powerful cross-modal features. In the decoder, a Thermal-induced Modality-interaction Dual-Branch Attention mechanism is designed to generate accurate saliency predictions by constructing modality-aware integration of foreground and background branches. Extensive experiments demonstrate that TMMANet outperforms most state-of-the-art RGB-T, RGB and RGB-D methods under various evaluation metrics, highlighting its effectiveness in enhancing RGB-T saliency detection performance. The related data of our TMMANet are released at https://github.com/SUTPangYu/TMMANet.
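The abstract does not spell out how the thermal-induced attention is computed, so the following is only an illustrative sketch of one generic way a thermal modality could guide attention over RGB features. It is not the TMMANet implementation (see the repository above for that). The function name `thermal_guided_attention`, the use of thermal tokens as queries with RGB tokens as keys and values, the absence of learned projections, and the residual addition are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def thermal_guided_attention(rgb_tokens, thermal_tokens):
    """Hypothetical cross-modal attention (assumption, not the paper's method).

    Each thermal token acts as a query; RGB tokens act as keys and values.
    The attended RGB feature is added back to the RGB token at the same
    position (a residual connection). No learned weights are used here.
    Returns (fused_tokens, attention_rows).
    """
    d = len(rgb_tokens[0])
    scale = 1.0 / math.sqrt(d)  # standard scaled dot-product factor
    fused, attention_rows = [], []
    for query, rgb in zip(thermal_tokens, rgb_tokens):
        weights = softmax([dot(query, key) * scale for key in rgb_tokens])
        attended = [
            sum(w * value[j] for w, value in zip(weights, rgb_tokens))
            for j in range(d)
        ]
        fused.append([r + a for r, a in zip(rgb, attended)])
        attention_rows.append(weights)
    return fused, attention_rows


# Toy example: three spatial positions, two feature channels each.
rgb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
thermal = [[0.5, 0.5], [2.0, 0.0], [0.0, 0.0]]
fused, attn = thermal_guided_attention(rgb, thermal)
```

In this toy setup, each row of `attn` is a probability distribution over RGB positions, and a thermal token with no response (all zeros) falls back to uniform averaging of the RGB features.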

Pages: 19