HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection

Cited by: 57
Authors
Tang, Bin [1]
Liu, Zhengyi [2]
Tan, Yacheng [2]
He, Qian [2]
Affiliations
[1] Hefei Univ, Sch Artificial Intelligence & Big Data, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei 230601, Peoples R China
Keywords
Task analysis; Convolution; Transformers; Object detection; Feature extraction; Convolutional neural networks; Streaming media; HRFormer; salient object detection; cross modality; RGB-D; RGB-T; light field; RGB-D IMAGE; NETWORK; FUSION;
DOI
10.1109/TCSVT.2022.3202563
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract
The High-Resolution Transformer (HRFormer) can maintain high-resolution representation and share global receptive fields. This makes it well suited to salient object detection (SOD), in which the input and output have the same resolution. However, two critical problems need to be solved for two-modality SOD. One is the fusion of the two modalities. The other is the fusion of HRFormer's output features. To address the first problem, a supplementary modality is injected into the primary modality by using global optimization and an attention mechanism to select and purify the modality at the input level. To solve the second problem, a dual-direction short connection fusion module is used to optimize the output features of HRFormer, thereby enhancing the detailed representation of objects at the output level. The proposed model, named HRTransNet, first introduces an auxiliary stream for feature extraction of the supplementary modality. Then, the features are injected into the primary modality at the beginning of each multi-resolution branch. Next, HRFormer performs forward propagation. Finally, all the output features with different resolutions are aggregated by intra-feature and inter-feature interactive transformers. Applying the proposed model yields impressive improvements on two-modality SOD tasks, e.g., RGB-D, RGB-T, and light field SOD. https://github.com/liuzywen/HRTransNet
Pages: 728-742
Number of pages: 15
Related papers
50 in total
  • [1] Salient Object Detection in Optical Remote Sensing Images Driven by Transformer
    Li, Gongyang
    Bai, Zhen
    Liu, Zhi
    Zhang, Xinpeng
    Ling, Haibin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 5257 - 5269
  • [2] Curiosity-Driven Salient Object Detection With Fragment Attention
    Wang, Zheng
    Wang, Pengzhi
    Han, Yahong
    Zhang, Xue
    Sun, Meijun
    Tian, Qi
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5989 - 6001
  • [3] Improving RGB-D Salient Object Detection via Modality-Aware Decoder
    Song, Mengke
    Song, Wenfeng
    Yang, Guowei
    Chen, Chenglizhao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6124 - 6138
  • [4] Double cross-modality progressively guided network for RGB-D salient object detection
    Yao, Cuili
    Feng, Lin
    Kong, Yuqiu
    Li, Shengming
    Li, Hang
    IMAGE AND VISION COMPUTING, 2022, 117
  • [5] Multi-Prior Driven Network for RGB-D Salient Object Detection
    Zhang, Xiaoqin
    Xu, Yuewang
    Wang, Tao
    Liao, Tangfei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9209 - 9222
  • [6] CGMDRNet: Cross-Guided Modality Difference Reduction Network for RGB-T Salient Object Detection
    Chen, Gang
    Shao, Feng
    Chai, Xiongli
    Chen, Hangwei
    Jiang, Qiuping
    Meng, Xiangchao
    Ho, Yo-Sung
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (09) : 6308 - 6323
  • [7] HDNet: Multi-Modality Hierarchy-Aware Decision Network for RGB-D Salient Object Detection
    Xia, Chengxing
    Duan, Songsong
    Ge, Bin
    Zhang, Hanling
    Li, Kuan-Ching
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2577 - 2581
  • [8] Flow driven attention network for video salient object detection
    Zhou, Feng
    Shuai, Hui
    Liu, Qingshan
    Guo, Guodong
    IET IMAGE PROCESSING, 2020, 14 (06) : 997 - 1004
  • [9] Modality-Induced Transfer-Fusion Network for RGB-D and RGB-T Salient Object Detection
    Chen, Gang
    Shao, Feng
    Chai, Xiongli
    Chen, Hangwei
    Jiang, Qiuping
    Meng, Xiangchao
    Ho, Yo-Sung
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (04) : 1787 - 1801
  • [10] Background-Driven Salient Object Detection
    Wang, Zilei
    Xiang, Dao
    Hou, Saihui
    Wu, Feng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (04) : 750 - 762