Spatial Structure Constraints for Weakly Supervised Semantic Segmentation

Cited by: 15
Authors
Chen, Tao [1 ]
Yao, Yazhou [1 ]
Huang, Xingguo [2 ]
Li, Zechao [1 ]
Nie, Liqiang [3 ]
Tang, Jinhui [1 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Jilin Univ, Coll Instrumentat & Elect Engn, Changchun 130061, Peoples R China
[3] Harbin Inst Technol Shenzhen, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cams; Semantic segmentation; Image reconstruction; Training; Task analysis; Annotations; Semantics; weak supervision; image-level label; spatial structure constraints; FEATURE ALIGNMENT; NETWORKS;
DOI
10.1109/TIP.2024.3359041
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The image-level label has prevailed in weakly supervised semantic segmentation tasks due to its easy availability. Since image-level labels can only indicate the existence or absence of specific categories of objects, visualization-based techniques have been widely adopted to provide object location clues. Considering class activation maps (CAMs) can only locate the most discriminative part of objects, recent approaches usually adopt an expansion strategy to enlarge the activation area for more integral object localization. However, without proper constraints, the expanded activation will easily intrude into the background region. In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion. Specifically, we propose a CAM-driven reconstruction module to directly reconstruct the input image from deep CAM features, which constrains the diffusion of last-layer object attention by preserving the coarse spatial structure of the image content. Moreover, we propose an activation self-modulation module to refine CAMs with finer spatial structure details by enhancing regional consistency. Without external saliency models to provide background clues, our approach achieves 72.7% and 47.0% mIoU on the PASCAL VOC 2012 and COCO datasets, respectively, demonstrating the superiority of our proposed approach.
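The mIoU figures quoted in the abstract are mean intersection-over-union scores. As a rough illustration of the metric only (not the authors' evaluation code), the sketch below computes mIoU over two flattened label sequences. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example; benchmark protocols such as PASCAL VOC typically accumulate a confusion matrix over the whole dataset instead.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in pred or target.

    pred, target: equal-length sequences of integer class labels
    (for example, flattened segmentation masks).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so `score` averages to 0.5.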
Pages: 1136-1148
Page count: 13