Cross-supervision-based equilibrated fusion mechanism of local and global attention for semantic segmentation

Cited by: 0

Authors:

Wenhao Yuan

Xiaoyan Lu

Rongfen Zhang

Yuhong Liu

Affiliation:

[1] Guizhou University, College of Big Data and Information Engineering

Source:

Applied Intelligence | 2023 / Vol. 53

Keywords:

Semantic segmentation; Class activation mapping; Dense conditional random fields; Cross-supervision; Multitask learning

DOI:

Not available

Abstract:

In recent years, weakly supervised semantic segmentation has become an active research direction, but inaccurate object localization and small activation areas remain open problems. In this paper, we propose a network with a cross-supervision-based equilibrated fusion mechanism of local and global attention (CSEFM). To accomplish multitask learning, we design a class activation branch and a decoding prediction branch, which share the same backbone, to generate high-quality pseudo labels and to obtain semantic segmentation results. Specifically, the network is first trained on images with strong labels to update the backbone weights. Images with weak labels are then used to obtain a reliable class activation mapping (CAM) at the class activation branch, and dense conditional random fields (DenseCRFs) are used to generate high-quality pseudo labels. Finally, the strongly labeled images and the weakly labeled images that have received pseudo labels are fed to the network for retraining, and the segmentation result is predicted by the decoding prediction branch. To suit this training method, the dataset is divided into groups, each consisting of a few images with strong labels and several images with weak labels, and the groups are fed to the network successively for cross-supervised learning. The proposed network is trained and validated on the PASCAL VOC 2012 dataset, where it reaches a mean Intersection over Union (mIoU) of 65.6% on the validation set. Compared with other mainstream methods, our approach achieves better segmentation and narrows the performance gap between image-level and pixel-level semantic segmentation.
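
The mIoU metric reported in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code: the function name `mean_iou` and its parameters are hypothetical, and the conventions of averaging only over classes that occur and skipping pixels labeled 255 are assumptions borrowed from common PASCAL VOC practice.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean Intersection over Union between two integer label maps.

    Hypothetical helper for illustration only; the ignore label and the
    averaging rule are assumptions, not details taken from the paper.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Drop pixels whose ground truth carries the ignore label.
    keep = target != ignore_index
    pred, target = pred[keep], target[keep]

    # confusion[i, j] counts pixels of true class i predicted as class j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection

    # Average only over classes that appear in the prediction or the target.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy 2x2 example with two classes.
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their mean, 7/12. For PASCAL VOC 2012 the same function would be called with `num_classes=21` (20 object classes plus background), typically after accumulating the confusion matrix over the whole validation set rather than per image.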

Pages: 11918-11933

Page count: 15
