Learning graph structures with transformer for weakly supervised semantic segmentation

Cited by: 1
Authors
Sun, Wanchun [1]
Feng, Xin [1, 2]
Ma, Hui [3]
Liu, Jingyao [1, 4]
Affiliations
[1] Changchun Univ Sci & Technol, Sch Comp Sci & Technol, Changchun 130022, Peoples R China
[2] Changchun Univ Sci & Technol, Chongqing Res Inst, Chongqing 401122, Peoples R China
[3] Anhui Vocat Coll Police Officers, Comp Basic Teaching & Res Dept, Hefei 232001, Peoples R China
[4] Chuzhou Univ, Sch Comp & Informat Engn, Chuzhou 239000, Peoples R China
Keywords
Weakly supervised; Transformer; Graph convolutional network; Semantic segmentation
DOI
10.1007/s40747-023-01152-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised semantic segmentation (WSSS) is a challenging task in computer vision. State-of-the-art semantic segmentation methods are usually based on convolutional neural networks (CNNs), which have two main drawbacks: they cannot correctly explore global information, and they fail to activate potential object regions. To avoid these drawbacks, transformers have been explored for the WSSS task, but a transformer alone cannot determine effective semantic associations between different patch tokens. To address this issue, and inspired by the graph convolutional network (GCN), this paper proposes a graph structure that learns the semantic category relationships between different blocks in the vector sequence. To verify the effectiveness of the proposed method, extensive experiments were conducted on the publicly available PASCAL VOC2012 dataset. The experimental results show that the proposed method achieves a significant performance improvement on the WSSS task and outperforms other state-of-the-art transformer-based methods.
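The abstract does not specify how the graph over patch tokens is built, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a cosine-similarity adjacency with a hypothetical `threshold` parameter and a single standard GCN propagation step, ReLU(D^-1/2 A D^-1/2 X W). The function names `cosine_adjacency` and `gcn_layer` are made up for this example, and only the Python standard library is used.

```python
import math


def cosine_adjacency(tokens, threshold=0.5):
    """Binary adjacency (with self-loops) from pairwise cosine similarity.

    ASSUMPTION: thresholded cosine similarity is an illustrative choice,
    not the graph construction used in the paper.
    """
    n = len(tokens)
    norms = [math.sqrt(sum(x * x for x in t)) or 1.0 for t in tokens]
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(tokens[i], tokens[j]))
            sim = dot / (norms[i] * norms[j])
            # Self-loops are always kept; other edges need sim > threshold.
            adj[i][j] = 1.0 if i == j or sim > threshold else 0.0
    return adj


def gcn_layer(tokens, adj, weight):
    """One GCN propagation step: ReLU(D^-1/2 A D^-1/2 X W)."""
    n = len(tokens)
    d_in = len(tokens[0])
    d_out = len(weight[0])
    # Degrees are at least 1 because every node has a self-loop.
    deg = [sum(row) for row in adj]
    norm_adj = [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]
    # Aggregate neighbour features: norm_adj @ tokens.
    agg = [
        [sum(norm_adj[i][k] * tokens[k][c] for k in range(n)) for c in range(d_in)]
        for i in range(n)
    ]
    # Linear projection followed by ReLU.
    return [
        [max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in))) for o in range(d_out)]
        for i in range(n)
    ]


# Toy example: three 2-D "patch tokens"; the first two are nearly parallel.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adj = cosine_adjacency(tokens)
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weight matrix
out = gcn_layer(tokens, adj, identity)
```

In this toy case the first two tokens are connected and their features are averaged, while the third token stays isolated and keeps its own feature. A practical implementation would operate on batched tensors with learned weights.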
Pages: 7511-7521
Number of pages: 11