Spatial attention guided cGAN for improved salient object detection

Cited by: 2
Authors
Dhara, Gayathri [1 ]
Kumar, Ravi Kant [1 ]
Institutions
[1] SRM Univ, Dept Comp Sci Engn, Vijayawada, Andhra Pradesh, India
Source
关键词
computer vision; Conditional Generative Adversarial Networks (cGANs); context information; encoder-decoder framework; image segmentation; salient object detection; spatial attention;
DOI
10.3389/fcomp.2024.1420965
CLC number
TP39 [Computer applications];
Subject classification codes
081203 ; 0835 ;
Abstract
Recent research shows that Conditional Generative Adversarial Networks (cGANs) are effective for Salient Object Detection (SOD), a challenging computer vision task that mimics the way human vision focuses on the important parts of an image. However, implementing cGANs for this task presents several complexities, including training instability with skip connections, weak generators, and difficulty in capturing context information for challenging images. These challenges are particularly evident for input images containing small salient objects against complex backgrounds, underscoring the need for careful design and tuning of cGANs to ensure accurate segmentation and detection of salient objects. To address these issues, we propose an innovative method for SOD using a cGAN framework. Our method utilizes an encoder-decoder framework as the generator component of the cGAN, enhancing feature extraction and facilitating accurate segmentation of salient objects. We incorporate the Wasserstein-1 distance into cGAN training to improve the accuracy of locating salient objects and to stabilize the training process. Additionally, our enhanced model efficiently captures intricate saliency cues by leveraging a spatial attention gate with global average pooling and regularization. The introduction of global average pooling layers in the encoder and decoder paths enhances the network's global perception and fine-grained detail capture, while the channel attention mechanism, implemented with dense layers, dynamically modulates feature maps to amplify saliency cues. The discriminator evaluates the generated saliency maps for authenticity and provides feedback that enhances the generator's ability to produce high-resolution saliency maps. By iteratively training the discriminator and generator networks, the model achieves improved results in detecting the salient object.
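The abstract does not spell out the implementation of the channel attention mechanism, but the description (global average pooling followed by dense layers that reweight feature maps) matches the familiar squeeze-and-excitation pattern. The following is a minimal numpy sketch under that assumption; the function name, weight shapes, and reduction ratio are illustrative, not taken from the paper.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation-style channel attention (illustrative sketch):
    global average pooling, a bottleneck of two dense layers, then
    sigmoid gates that rescale each channel of the input.
    feature_map: (C, H, W); w1: (C // r, C); w2: (C, C // r)
    """
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    squeezed = feature_map.mean(axis=(1, 2))
    # Excite: bottleneck dense layers (ReLU, then sigmoid to get gates in (0, 1))
    hidden = np.maximum(0.0, w1 @ squeezed)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Reweight each channel of the feature map by its gate
    return feature_map * gates[:, None, None]
```

Because the gates lie in (0, 1), the mechanism can only attenuate channels relative to the input, which lets the network suppress non-salient channels while preserving the ones carrying saliency cues.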
We trained and validated our model on large-scale benchmark datasets commonly used for salient object detection: DUTS, ECSSD, and DUT-OMRON. Performance was evaluated with the standard metrics of precision, recall, mean absolute error (MAE), and the F-beta score. Compared to other state-of-the-art methods, our approach achieved the lowest MAE values: 0.0292 on ECSSD, 0.033 on DUTS-TE, and 0.0439 on the challenging and complex DUT-OMRON dataset. The proposed method demonstrates significant improvements in salient object detection, highlighting its potential benefits for real-life applications.
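For readers unfamiliar with the reported metrics, the two used for the headline numbers can be sketched as below. This is a generic illustration, not the paper's evaluation code; the threshold and the convention beta^2 = 0.3 (common in SOD benchmarks) are assumptions.

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a predicted saliency map and the
    ground-truth mask, both with values in [0, 1]."""
    return float(np.abs(pred - gt).mean())

def f_beta(pred, gt, threshold=0.5, beta2=0.3):
    """F-beta score on the thresholded saliency map; beta^2 = 0.3 is the
    value conventionally used in salient object detection benchmarks."""
    binary = pred >= threshold
    positives = gt > 0.5
    tp = np.logical_and(binary, positives).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max(positives.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return float((1 + beta2) * precision * recall / (beta2 * precision + recall))
```

A lower MAE (as in the 0.0292 result on ECSSD) means the predicted map deviates less, per pixel, from the ground truth; a higher F-beta means the thresholded map overlaps the ground-truth object more precisely.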
Pages: 15