Generative Transformer for Accurate and Reliable Salient Object Detection

Authors
Mao, Yuxin [1, 2]
Zhang, Jing [3]
Wan, Zhexiong [1, 2]
Tian, Xinyu [1, 2]
Li, Aixuan [1, 2]
Lv, Yunqiu [1, 2]
Dai, Yuchao [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710072, Peoples R China
[2] Shaanxi Key Lab Informat Acquisit & Proc, Xian 710072, Peoples R China
[3] Australian Natl Univ, Sch Comp, Canberra, ACT 2601, Australia
Funding
National Natural Science Foundation of China
Keywords
Transformers; Context modeling; Predictive models; Object detection; Accuracy; Reliability; Generative adversarial networks; Feature extraction; Decoding; Visualization; Vision transformer; salient object detection; inferential generative adversarial network; ATTENTION; NETWORK;
DOI
10.1109/TCSVT.2024.3469286
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
We explore the impact of transformers on accurate and reliable salient object detection. For accuracy, we integrate the transformer with a deterministic model and delineate its advantages in structural modeling. Regarding reliability, we address the transformer's tendency to produce overly confident, incorrect predictions. To gauge reliability implicitly, we introduce a latent variable model within the transformer framework, termed the inferential generative adversarial network (iGAN). The stochastic nature of the latent variable facilitates the estimation of predictive uncertainty, which serves as an auxiliary measure of the model's prediction reliability. Unlike the conventional GAN, which fixes the distribution of the latent variable to the standard normal distribution N(0, I), the proposed iGAN infers the latent variable by gradient-based Markov chain Monte Carlo (MCMC), namely Langevin dynamics, leading to an input-dependent latent variable model. We apply the proposed iGAN to fully supervised salient object detection and show that iGAN within the transformer framework leads to both accurate and reliable salient object detection.
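The Langevin dynamics mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `langevin_sample`, the step size, the number of steps, and the toy Gaussian target are all assumptions chosen for illustration. In the iGAN setting the gradient would presumably come from the network and the input image, which is what would make the inferred latent variable input-dependent; here a simple analytic gradient stands in for it.

```python
import numpy as np

def langevin_sample(grad_log_p, z0, step_size=0.3, n_steps=200, rng=None):
    """Unadjusted Langevin dynamics (hypothetical helper, not from the paper).

    Each step applies
        z <- z + (step_size**2 / 2) * grad_log_p(z) + step_size * noise
    with noise drawn from N(0, I).
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.array(z0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z + 0.5 * step_size**2 * grad_log_p(z) + step_size * noise
    return z

# Toy target (assumption): a Gaussian N(mu, I), whose score is -(z - mu).
mu = np.array([2.0, -1.0])

def grad_log_p(z):
    return -(z - mu)

rng = np.random.default_rng(0)
# Start 1000 independent chains from the fixed prior N(0, I).
z_init = rng.standard_normal((1000, 2))
samples = langevin_sample(grad_log_p, z_init, rng=rng)

# The chains drift from the prior toward the target distribution.
print("sample mean:", samples.mean(axis=0))
print("sample std: ", samples.std(axis=0))
```

Starting every chain from N(0, I) mirrors the fixed prior of a conventional GAN; the gradient term then moves each latent sample toward the target, and the spread of the final samples gives a rough picture of how sampling several latent variables could yield an uncertainty estimate.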
Pages: 1041-1054
Page count: 14