Box2Seg: Attention Weighted Loss and Discriminative Feature Learning for Weakly Supervised Segmentation

Cited by: 76
Authors
Kulharia, Viveka [2]
Chandra, Siddhartha [1]
Agrawal, Amit [1]
Torr, Philip [2]
Tyagi, Ambrish [1]
Affiliations
[1] Amazon Lab126, Sunnyvale, CA 94089 USA
[2] Univ Oxford, Oxford, England
Source
COMPUTER VISION - ECCV 2020, PT XXVII | 2020 / Vol. 12372
DOI
10.1007/978-3-030-58583-9_18
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a weakly supervised approach to semantic segmentation using bounding box annotations. Bounding boxes are treated as noisy labels for the foreground objects. We predict a per-class attention map that saliently guides the per-pixel cross-entropy loss to focus on foreground pixels and refines the segmentation boundaries. This avoids propagating erroneous gradients due to incorrect foreground labels on the background. Additionally, we learn pixel embeddings to simultaneously optimize for high intra-class feature affinity while increasing discrimination between features across different classes. Our method, Box2Seg, achieves state-of-the-art segmentation accuracy on PASCAL VOC 2012, significantly improving the mIOU metric by 2.1% compared to previous weakly supervised approaches. Our weakly supervised approach is comparable to recent fully supervised methods when fine-tuned with a limited amount of pixel-level annotations. Qualitative results and ablation studies show the benefit of the different loss terms on the overall performance.
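The idea of an attention map weighting the per-pixel cross-entropy loss can be illustrated with a minimal NumPy sketch. This is not the authors' formulation: the function name `attention_weighted_ce`, the tensor layout, and the normalization by the sum of weights are all assumptions made for illustration, and the abstract does not specify how the attention map is predicted or regularized.

```python
import numpy as np

def attention_weighted_ce(logits, labels, attention, eps=1e-8):
    """Hypothetical attention-weighted cross entropy (illustrative only).

    logits:    (C, H, W) raw class scores
    labels:    (H, W) integer class labels, e.g. derived from noisy boxes
    attention: (C, H, W) per-class weights in [0, 1]
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    # Pick, for every pixel, the entry belonging to its (noisy) label.
    rows, cols = np.indices(labels.shape)
    nll = -log_probs[labels, rows, cols]      # (H, W) per-pixel loss
    weights = attention[labels, rows, cols]   # (H, W) per-pixel weight

    # Assumed normalization: weighted mean over pixels.
    return float((weights * nll).sum() / (weights.sum() + eps))
```

In this sketch, a pixel whose attention weight is zero contributes nothing to the loss, which mirrors the abstract's point that wrongly labelled background pixels inside a box should not propagate gradients.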
Pages: 290-308
Page count: 19