HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection

Cited by: 57
Authors
Tang, Bin [1]
Liu, Zhengyi [2]
Tan, Yacheng [2]
He, Qian [2]
Affiliations
[1] Hefei Univ, Sch Artificial Intelligence & Big Data, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei 230601, Peoples R China
Keywords
Task analysis; Convolution; Transformers; Object detection; Feature extraction; Convolutional neural networks; Streaming media; HRFormer; salient object detection; cross modality; RGB-D; RGB-T; light field; RGB-D IMAGE; NETWORK; FUSION;
DOI
10.1109/TCSVT.2022.3202563
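Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract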
The High-Resolution Transformer (HRFormer) can maintain high-resolution representation and share global receptive fields. It is well suited to salient object detection (SOD), in which the input and output have the same resolution. However, two critical problems need to be solved for two-modality SOD. One problem is two-modality fusion. The other problem is the fusion of the HRFormer outputs. To address the first problem, a supplementary modality is injected into the primary modality by using global optimization and an attention mechanism to select and purify the modality at the input level. To solve the second problem, a dual-direction short connection fusion module is used to optimize the output features of HRFormer, thereby enhancing the detailed representation of objects at the output level. The proposed model, named HRTransNet, first introduces an auxiliary stream for feature extraction of the supplementary modality. Then, features are injected into the primary modality at the beginning of each multi-resolution branch. Next, HRFormer is applied to achieve forward propagation. Finally, all the output features with different resolutions are aggregated by intra-feature and inter-feature interactive transformers. Application of the proposed model results in impressive improvement for driving two-modality SOD tasks, e.g., RGB-D, RGB-T, and light field SOD. https://github.com/liuzywen/HRTransNet
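The injection step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that); it only assumes, for illustration, that "injection" means adding supplementary features to primary features after scaling them by a per-channel attention gate derived from global average pooling. The function names `inject` and `inject_all_branches` and the list-based feature layout are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function used here as a simple attention gate."""
    return 1.0 / (1.0 + math.exp(-x))


def inject(primary, supplementary):
    """Inject supplementary features into primary features (toy version).

    Both arguments are lists of channels; each channel is a flat list of
    floats. ASSUMPTION: the gate is a sigmoid over the channel's global
    average, which stands in for the attention mechanism in the abstract.
    """
    fused = []
    for p_ch, s_ch in zip(primary, supplementary):
        # Global average pooling over the channel gives one scalar.
        gate = sigmoid(sum(s_ch) / len(s_ch))
        # Gated supplementary features are added to the primary features.
        fused.append([p + gate * s for p, s in zip(p_ch, s_ch)])
    return fused


def inject_all_branches(primary_branches, supplementary_branches):
    """Apply the injection at the start of every resolution branch."""
    return [
        inject(p, s)
        for p, s in zip(primary_branches, supplementary_branches)
    ]


if __name__ == "__main__":
    # Two hypothetical branches with different spatial sizes, one channel each.
    rgb = [[[1.0, 2.0, 3.0, 4.0]], [[1.0, 2.0]]]
    depth = [[[0.0, 0.0, 0.0, 0.0]], [[2.0, 2.0]]]
    for branch in inject_all_branches(rgb, depth):
        print(branch)
```

In the actual model the fused features of each branch would then pass through HRFormer and the output fusion module; those stages are omitted here.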
Pages: 728-742
Page count: 15
Related papers
50 in total
  • [31] TS-BiT: Two-Stage Binary Transformer for ORSI Salient Object Detection
    Zhang, Jinfeng
    Liu, Tianpeng
    Zhang, Jiehua
    Liu, Li
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2025, 22
  • [32] Salient Object Detection Based on Visual Perceptual Saturation and Two-Stream Hybrid Networks
    Pan, Chen
    Liu, Jianfeng
    Yan, Wei Qi
    Cao, Feilong
    He, Wei
    Zhou, Yongxia
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 4773 - 4787
  • [33] MULTI-MODALITY DIVERSITY FUSION NETWORK WITH SWINTRANSFORMER FOR RGB-D SALIENT OBJECT DETECTION
    Duan, Songsong
    Xia, Chenxing
    Gao, Xiuju
    Ge, Bin
    Zhang, Hanling
    Li, Kuan-Ching
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1076 - 1080
  • [34] Modality Registration and Object Search Framework for UAV-Based Unregistered RGB-T Image Salient Object Detection
    Song, Kechen
    Wen, Hongwei
    Xue, Xiaotong
    Huang, Liming
    Ji, Yingying
    Yan, Yunhui
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 15
  • [35] Weakly Supervised Salient Object Detection by Learning A Classifier-Driven Map Generator
    Hsu, Kuang-Jui
    Lin, Yen-Yu
    Chuang, Yung-Yu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (11) : 5435 - 5449
  • [36] RCNet: Related Context-Driven Network with Hierarchical Attention for Salient Object Detection
    Xia, Chenxing
    Sun, Yanguang
    Li, Kuan-Ching
    Ge, Bin
    Zhang, Hanling
    Jiang, Bo
    Zhang, Ji
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [37] Saliency Rank: Two-stage manifold ranking for salient object detection
    Wei Qi
    Ming-Ming Cheng
    Ali Borji
    Huchuan Lu
    Lian-Fa Bai
    Computational Visual Media, 2015, 1 (04) : 309 - 320
  • [38] Salient Object Detection based on CNN Fusion of Two Types of Saliency Models
    Hassan, Muhammad Umair
    Niu, Dongmei
    Zhao, Xiuyang
    Shohag, Md Shakil Ahamed
    Ma, Yingjun
    Zhang, Mingxuan
    2019 INTERNATIONAL CONFERENCE ON IMAGE AND VISION COMPUTING NEW ZEALAND (IVCNZ), 2019
  • [39] Two-Stage Edge Reuse Network for Salient Object Detection of Strip Steel Surface Defects
    Han, Chengjun
    Li, Gongyang
    Liu, Zhi
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [40] CAFCNet: Cross-modality asymmetric feature complement network for RGB-T salient object detection
    Jin, Dongze
    Shao, Feng
    Xie, Zhengxuan
    Mu, Baoyang
    Chen, Hangwei
    Jiang, Qiuping
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 247