Multi-Granularity Denoising and Bidirectional Alignment for Weakly Supervised Semantic Segmentation

被引:22
作者
Chen, Tao [1 ]
Yao, Yazhou [1 ]
Tang, Jinhui [1 ]
机构
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
基金
中国博士后科学基金; 中国国家自然科学基金;
关键词
Training; Noise measurement; Task analysis; Cams; Semantic segmentation; Filtering; Adversarial machine learning; weak supervision; image-level label; saliency map; noisy label; SALIENT OBJECT DETECTION;
D O I
10.1109/TIP.2023.3275913
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Weakly supervised semantic segmentation (WSSS) models relying on class activation maps (CAMs) have achieved desirable performance comparing to the non-CAMs-based counterparts. However, to guarantee WSSS task feasible, we need to generate pseudo labels by expanding the seeds from CAMs which is complex and time-consuming, thus hindering the design of efficient end-to-end (single-stage) WSSS approaches. To tackle the above dilemma, we resort to the off-the-shelf and readily accessible saliency maps for directly obtaining pseudo labels given the image-level class labels. Nevertheless, the salient regions may contain noisy labels and cannot seamlessly fit the target objects, and saliency maps can only be approximated as pseudo labels for simple images containing single-class objects. As such, the achieved segmentation model with these simple images cannot generalize well to the complex images containing multi-class objects. To this end, we propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model, to alleviate the noisy label and multi-class generalization issues. Specifically, we propose the online noise filtering and progressive noise detection modules to tackle image-level and pixel-level noise, respectively. Moreover, a bidirectional alignment mechanism is proposed to reduce the data distribution gap at both input and output space with simple-to-complex image synthesis and complex-to-simple adversarial learning. MDBA can reach the mIoU of 69.5% and 70.2% on validation and test sets for the PASCAL VOC 2012 dataset. The source codes and models have been made available at https://github.com/NUST-MachineIntelligence-Laboratory/MDBA.
引用
收藏
页码:2960 / 2971
页数:12
相关论文
共 73 条
[1]   Weakly Supervised Learning of Instance Segmentation with Inter-pixel Relations [J].
Ahn, Jiwoon ;
Cho, Sunghyun ;
Kwak, Suha .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :2204-2213
[2]   Learning Pixel-level Semantic Affinity with Image-level Supervision forWeakly Supervised Semantic Segmentation [J].
Ahn, Jiwoon ;
Kwak, Suha .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :4981-4990
[3]  
[Anonymous], 2015, P ICLR
[4]   Single-Stage Semantic Segmentation from Image Labels [J].
Araslanov, Nikita ;
Roth, Stefan .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, :4252-4261
[5]  
Arpit D, 2017, PR MACH LEARN RES, V70
[6]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[7]   What's the Point: Semantic Segmentation with Point Supervision [J].
Bearman, Amy ;
Russakovsky, Olga ;
Ferrari, Vittorio ;
Fei-Fei, Li .
COMPUTER VISION - ECCV 2016, PT VII, 2016, 9911 :549-565
[8]   Salient Object Detection: A Benchmark [J].
Borji, Ali ;
Cheng, Ming-Ming ;
Jiang, Huaizu ;
Li, Jia .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (12) :5706-5722
[9]   Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks [J].
Bousmalis, Konstantinos ;
Silberman, Nathan ;
Dohan, David ;
Erhan, Dumitru ;
Krishnan, Dilip .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :95-104
[10]   End-to-end Boundary Exploration forWeakly-supervised Semantic Segmentation [J].
Chen, Jianjun ;
Fang, Shancheng ;
Xie, Hongtao ;
Zha, Zhengjun ;
Hu, Yue ;
Tan, Jianlong .
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, :2381-2390