Generative Transformer for Accurate and Reliable Salient Object Detection

Authors
Mao, Yuxin [1, 2]
Zhang, Jing [3]
Wan, Zhexiong [1, 2]
Tian, Xinyu [1, 2]
Li, Aixuan [1, 2]
Lv, Yunqiu [1, 2]
Dai, Yuchao [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710072, Peoples R China
[2] Shaanxi Key Lab Informat Acquisit & Proc, Xian 710072, Peoples R China
[3] Australian Natl Univ, Sch Comp, Canberra, ACT 2601, Australia
Funding
National Natural Science Foundation of China
Keywords
Transformers; Context modeling; Predictive models; Object detection; Accuracy; Reliability; Generative adversarial networks; Feature extraction; Decoding; Visualization; Vision transformer; salient object detection; inferential generative adversarial network; ATTENTION; NETWORK;
DOI
10.1109/TCSVT.2024.3469286
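Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract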
We explore the impact of transformers on accurate and reliable salient object detection. For accuracy, we integrate the transformer with a deterministic model and delineate its advantages in structural modeling. Regarding reliability, we address the transformer's tendency to produce overly confident, incorrect predictions. To gauge reliability implicitly, we introduce a latent variable model within the transformer framework, termed the inferential generative adversarial network (iGAN). The stochastic nature of the latent variable facilitates the estimation of predictive uncertainty, which serves as an auxiliary measure of the model's prediction reliability. Unlike the conventional GAN, which fixes the distribution of the latent variable as a standard normal distribution N(0, I), the proposed iGAN infers the latent variable by gradient-based Markov Chain Monte Carlo (MCMC), namely Langevin dynamics, leading to an input-dependent latent variable model. We apply the proposed iGAN to fully supervised salient object detection, showing that iGAN within the transformer framework leads to both accurate and reliable salient object detection.
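The Langevin-dynamics inference mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hypothetical linear "generator" y = w * z + noise with a scalar latent z, a standard normal prior, and Gaussian noise of scale sigma, so that the gradient of the log posterior is available in closed form. The names `langevin_infer` and `predictive_uncertainty`, the step size, and the number of steps are all illustrative choices.

```python
import random
import statistics


def langevin_infer(y, w, sigma=0.5, step=0.1, n_steps=500, rng=None):
    """Draw one approximate sample of z from p(z | y) via Langevin dynamics.

    Toy assumption: y_i = w_i * z + Gaussian noise (std sigma), prior z ~ N(0, 1).
    """
    rng = rng or random.Random(0)
    z = rng.gauss(0.0, 1.0)  # initialise from the prior N(0, 1)
    for _ in range(n_steps):
        # Gradient of log p(z | y): likelihood term plus prior term (-z).
        grad = sum(wi * (yi - wi * z) for wi, yi in zip(w, y)) / sigma**2 - z
        # Langevin update: gradient step plus injected Gaussian noise.
        z = z + 0.5 * step**2 * grad + step * rng.gauss(0.0, 1.0)
    return z


def predictive_uncertainty(y, w, n_samples=64, seed=0):
    """Sample several latents and summarise the spread of the predictions."""
    rng = random.Random(seed)
    zs = [langevin_infer(y, w, rng=rng) for _ in range(n_samples)]
    preds = [[wi * z for wi in w] for z in zs]  # one prediction per latent sample
    mean = [statistics.fmean(col) for col in zip(*preds)]
    var = [statistics.pvariance(col) for col in zip(*preds)]  # per-output variance
    return zs, mean, var


if __name__ == "__main__":
    w = [1.0, 0.5, -1.0]
    y = [0.8, 0.4, -0.9]
    zs, mean, var = predictive_uncertainty(y, w)
    print("mean latent:", statistics.fmean(zs))
    print("prediction mean:", mean)
    print("prediction variance:", var)
```

In this toy setting the per-output variance across latent samples plays the role of the predictive uncertainty described in the abstract; in a real model the closed-form gradient would be replaced by backpropagation through the network.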
Pages: 1041-1054
Page count: 14