CTIF-Net: A CNN-Transformer Iterative Fusion Network for Salient Object Detection

Cited by: 15
Authors
Yuan, Junbin [1]
Zhu, Aiqing [1]
Xu, Qingzhen [1]
Wattanachote, Kanoksak [2]
Gong, Yongyi [2]
Affiliations
[1] South China Normal Univ, Sch Comp Sci, Guangzhou 510631, Peoples R China
[2] Guangdong Univ Foreign Studies, Sch Informat Sci & Technol, Intelligent Hlth & Visual Comp Lab, Guangzhou 510006, Peoples R China
Keywords
CNN; transformer; iterative fusion; salient object detection; ATTENTION; MODEL;
DOI
10.1109/TCSVT.2023.3321190
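Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code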
0808 ; 0809 ;
Abstract
Capturing sufficient global context and rich spatial structure information is critical for dense prediction tasks. Convolutional Neural Networks (CNNs) are particularly adept at modeling fine-grained local features, while Transformers excel at modeling global context, so the two exhibit complementary characteristics. Designing a network that efficiently fuses these two models to fully leverage their strengths and achieve more accurate detection is a promising research topic. In this paper, we introduce a novel CNN-Transformer Iterative Fusion Network (CTIF-Net) for salient object detection. It combines a CNN and a Transformer through a parallel dual-encoder structure and a feature iterative fusion module. First, CTIF-Net extracts features from the image with the CNN and the Transformer separately. Second, two feature convertors and a feature iterative fusion module combine and iteratively refine the two sets of features. Experimental results on multiple SOD datasets show that CTIF-Net outperforms 17 state-of-the-art methods on mainstream evaluation metrics such as F-measure, S-measure, and MAE. Code can be found at https://github.com/danielfaster/CTIF-Net/.
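As a rough illustration of two of the metrics named in the abstract, the sketch below computes MAE and a fixed-threshold F-measure for a single flattened saliency map. It is not taken from the authors' repository. The function names, the threshold of 0.5, and the weight beta2 = 0.3 (a value commonly used in salient object detection work) are assumptions made for illustration, and S-measure is omitted.

```python
from typing import Sequence


def mae(pred: Sequence[float], gt: Sequence[int]) -> float:
    """Mean absolute error between a saliency map and its binary ground truth.

    Both inputs are flattened pixel lists; pred values are assumed to lie in [0, 1].
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred: Sequence[float], gt: Sequence[int],
              threshold: float = 0.5, beta2: float = 0.3) -> float:
    """Weighted F-measure at one binarization threshold.

    threshold and beta2 are illustrative assumptions, not values from the paper.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    binary = [1 if p >= threshold else 0 for p in pred]
    true_pos = sum(1 for b, g in zip(binary, gt) if b == 1 and g == 1)
    if true_pos == 0:
        return 0.0  # no correctly detected salient pixel
    precision = true_pos / sum(binary)
    recall = true_pos / sum(1 for g in gt if g == 1)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


if __name__ == "__main__":
    prediction = [0.9, 0.8, 0.2, 0.1]   # hypothetical 4-pixel saliency map
    ground_truth = [1, 0, 1, 0]         # hypothetical binary mask
    print("MAE:", mae(prediction, ground_truth))
    print("F-measure:", f_measure(prediction, ground_truth))
```

Lower MAE and higher F-measure indicate better agreement with the ground truth. Benchmark reports typically average such scores over a whole dataset and may sweep the threshold rather than fix it.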
Pages: 3795 - 3805
Number of pages: 11
Related papers
50 in total
  • [31] Feature extraction and fusion network for salient object detection
    Chao Dai
    Chen Pan
    Wei He
    Multimedia Tools and Applications, 2022, 81 : 33955 - 33969
  • [32] Selective feature fusion network for salient object detection
    Sun, Fengming
    Yuan, Xia
    Zhao, Chunxia
    IET COMPUTER VISION, 2023, 17 (04) : 483 - 495
  • [33] Transformers and CNNs fusion network for salient object detection
    Yao, Cuili
    Feng, Lin
    Kong, Yuqiu
    Xiao, Lin
    Chen, Tao
    NEUROCOMPUTING, 2023, 520 : 342 - 355
  • [34] CNN-TransNet: A Hybrid CNN-Transformer Network With Differential Feature Enhancement for Cloud Detection
    Ma, Nan
    Sun, Lin
    He, Yawen
    Zhou, Chenghu
    Dong, Chuanxiang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [35] Hybrid CNN-transformer network for efficient CSI feedback
    Zhao, Ruohan
    Liu, Ziang
    Song, Tianyu
    Jin, Jiyu
    Jin, Guiyue
    Fan, Lei
    PHYSICAL COMMUNICATION, 2024, 66
  • [36] Image harmonization with Simple Hybrid CNN-Transformer Network
    Li, Guanlin
    Zhao, Bin
    Li, Xuelong
    NEURAL NETWORKS, 2024, 180
  • [37] Hybrid CNN-Transformer Feature Fusion for Single Image Deraining
    Chen, Xiang
    Pan, Jinshan
    Lu, Jiyang
    Fan, Zhentao
    Li, Hao
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023 : 378 - 386
  • [38] CNN-Transformer Hybrid Architecture for Early Fire Detection
    Yang, Chenyue
    Pan, Yixuan
    Cao, Yichao
    Lu, Xiaobo
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT IV, 2022, 13532 : 570 - 581
  • [39] RGB-D Salient Object Detection by a CNN With Multiple Layers Fusion
    Huang, Rui
    Xing, Yan
    Wang, ZeZheng
    IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (04) : 552 - 556
  • [40] Salient Object Detection based on CNN Fusion of Two Types of Saliency Models
    Hassan, Muhammad Umair
    Niu, Dongmei
    Zhao, Xiuyang
    Shohag, Md Shakil Ahamed
    Ma, Yingjun
    Zhang, Mingxuan
    2019 INTERNATIONAL CONFERENCE ON IMAGE AND VISION COMPUTING NEW ZEALAND (IVCNZ), 2019,