Learning graph structures with transformer for weakly supervised semantic segmentation

Cited by: 1

Authors:

Sun, Wanchun ^{[1]}

Feng, Xin ^{[1,2]}

Ma, Hui ^{[3]}

Liu, Jingyao ^{[1,4]}

Affiliations:

[1] Changchun Univ Sci & Technol, Sch Comp Sci & Technol, Changchun 130022, Peoples R China

[2] Changchun Univ Sci & Technol, Chongqing Res Inst, Chongqing 401122, Peoples R China

[3] Anhui Vocat Coll Police Officers, Comp Basic Teaching & Res Dept, Hefei 232001, Peoples R China

[4] Chuzhou Univ, Sch Comp & Informat Engn, Chuzhou 239000, Peoples R China

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2023 / Vol. 9 / Issue 06

Keywords:

Weakly supervised; Transformer; Graph convolutional network; Semantic segmentation;

DOI:

10.1007/s40747-023-01152-x

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Weakly supervised semantic segmentation (WSSS) is a challenging computer vision task. State-of-the-art semantic segmentation methods are usually based on convolutional neural networks (CNNs), which have two main drawbacks: they cannot capture global information correctly, and they fail to activate potential object regions. Transformers have been explored for WSSS to avoid these drawbacks, but a transformer alone establishes no effective semantic association between different patch tokens. To address this issue, and inspired by the graph convolutional network (GCN), this paper proposes a graph structure that learns the semantic category relationships between different blocks in the vector sequence. To verify the effectiveness of the proposed method, extensive experiments were conducted on the publicly available PASCAL VOC2012 dataset. The results show that the proposed method achieves a significant performance improvement on the WSSS task and outperforms other state-of-the-art transformer-based methods.

Pages: 7511-7521

Page count: 11

