HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection

Cited by: 57

Authors:

Tang, Bin [1]

Liu, Zhengyi [2]

Tan, Yacheng [2]

He, Qian [2]

Affiliations:

[1] Hefei Univ, Sch Artificial Intelligence & Big Data, Hefei 230601, Peoples R China

[2] Anhui Univ, Sch Comp Sci & Technol, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei 230601, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2023, Vol. 33, No. 02

Keywords:

Task analysis; Convolution; Transformers; Object detection; Feature extraction; Convolutional neural networks; Streaming media; HRFormer; salient object detection; cross modality; RGB-D; RGB-T; light field; RGB-D IMAGE; NETWORK; FUSION;

DOI:

10.1109/TCSVT.2022.3202563

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

The High-Resolution Transformer (HRFormer) maintains a high-resolution representation while sharing global receptive fields. This makes it well suited to salient object detection (SOD), in which the input and output have the same resolution. However, two critical problems must be solved for two-modality SOD: fusing the two modalities, and fusing the outputs of HRFormer. To address the first problem, a supplementary modality is injected into the primary modality, using global optimization and an attention mechanism to select and purify the modality at the input level. To address the second problem, a dual-direction short connection fusion module optimizes the output features of HRFormer, thereby enhancing the detailed representation of objects at the output level. The proposed model, named HRTransNet, first introduces an auxiliary stream for feature extraction from the supplementary modality. These features are then injected into the primary modality at the beginning of each multi-resolution branch. Next, HRFormer is applied for forward propagation. Finally, all output features at different resolutions are aggregated by intra-feature and inter-feature interactive transformers. The proposed model yields substantial improvements on two-modality SOD tasks, e.g., RGB-D, RGB-T, and light field SOD. Project repository: https://github.com/liuzywen/HRTransNet
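As an illustration of the input-level injection idea described in the abstract, here is a minimal sketch in plain Python. It is not the authors' module: the abstract does not specify the attention design, so the sigmoid channel gate below and the names `channel_gate` and `inject` are assumptions chosen for illustration. Features are represented as lists of channels, each channel a flat list of spatial values.

```python
import math


def channel_gate(feature):
    """Return one sigmoid weight per channel, computed from its global average.

    Hypothetical stand-in for the paper's attention mechanism.
    """
    weights = []
    for channel in feature:
        mean = sum(channel) / len(channel)
        weights.append(1.0 / (1.0 + math.exp(-mean)))
    return weights


def inject(primary, supplementary):
    """Add gated supplementary features to the primary features, channel by channel."""
    if len(primary) != len(supplementary):
        raise ValueError("channel counts must match")
    weights = channel_gate(supplementary)
    fused = []
    for p_chan, s_chan, w in zip(primary, supplementary, weights):
        if len(p_chan) != len(s_chan):
            raise ValueError("spatial sizes must match")
        fused.append([p + w * s for p, s in zip(p_chan, s_chan)])
    return fused


# Toy example: 2 channels, 4 spatial positions each (values are made up).
rgb = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
depth = [[0.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0]]
fused = inject(rgb, depth)
```

In the model described above, an injection of this kind would be applied at the start of each multi-resolution branch; the sketch covers a single branch only.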


Pages: 728-742

Page count: 15

