Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation

Cited by: 11

Authors:

Xu, Jianjun [1]

Xie, Hongtao [1]

Xu, Hai [1]

Wang, Yuxin [1]

Liu, Sun-ao [1]

Zhang, Yongdong [1]

Affiliations:

[1] University of Science and Technology of China, Hefei, People's Republic of China

Source:

Proceedings of the 30th ACM International Conference on Multimedia, MM 2022 | 2022

Keywords:

weakly supervised semantic segmentation; weakly supervised learning; semantic segmentation; background bias; representation

DOI:

10.1145/3503161.3548201

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Previous image-level weakly-supervised semantic segmentation methods based on Class Activation Map (CAM) have two limitations: 1) focusing on partial discriminative foreground regions and 2) containing undesirable background. The above issues are attributed to the spurious correlations between the object and background (semantic ambiguity) and the insufficient spatial perception ability of the classification network (spatial ambiguity). In this work, we propose a novel self-supervised framework to mitigate the semantic and spatial ambiguity from the perspectives of background bias and object perception. First, a background decoupling mechanism (BDM) is proposed to handle the semantic ambiguity by regularizing the consistency of predicted CAMs from the samples with identical foregrounds but different backgrounds. Thus, a decoupled relationship is constructed to reduce the dependence between the object instance and the scene information. Second, a global object-aware pooling (GOP) is introduced to alleviate spatial ambiguity. The GOP utilizes a learnable object-aware map to dynamically aggregate spatial information and further improve the performance of CAMs. Extensive experiments demonstrate the effectiveness of our method by achieving new state-of-the-art results on both the Pascal VOC 2012 and MS COCO 2014 datasets.


Pages: 5783-5792

Page count: 10

