Self-Correction for Human Parsing

Cited by: 148
Authors
Li, Peike [1]
Xu, Yunqiu [1]
Wei, Yunchao [1]
Yang, Yi [1]
Affiliations
[1] Univ Technol Sydney, Australian Artificial Intelligence Inst, ReLER Lab, Ultimo, NSW 2007, Australia
Funding
Australian Research Council
Keywords
Training; Task analysis; Predictive models; Annotations; Semantics; Analytical models; Solid modeling; Human parsing; learning with label noise; fine-grained semantic segmentation; video human parsing;
DOI
10.1109/TPAMI.2020.3048039
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Labeling pixel-level masks for fine-grained semantic segmentation tasks, e.g., human parsing, remains challenging. Ambiguous boundaries between semantic parts and categories with similar appearances often confuse annotators, leading to incorrect labels in the ground-truth masks. This label noise inevitably harms training and degrades the performance of the learned models. To tackle this issue, we introduce a noise-tolerant method, Self-Correction for Human Parsing (SCHP), that progressively improves the reliability of both the supervised labels and the learned models. In particular, starting from a model trained with inaccurate annotations as initialization, we design a cyclical learning scheduler that infers more reliable pseudo masks by iteratively aggregating the currently learned model with the former sub-optimal one in an online manner. The corrected labels can in turn further boost model performance. In this way, the models and the labels reciprocally become more robust and accurate over the self-correction learning cycles. SCHP is model-agnostic and can be applied to any human parsing model to further enhance its performance. Extensive experiments on four human parsing models, Deeplab V3+, CE2P, OCR and CE2P+, demonstrate the effectiveness of the proposed SCHP. We achieve new state-of-the-art results on six benchmarks: LIP, Pascal-Person-Part and ATR for single human parsing, CIHP and MHP for multi-person human parsing, and VIP for video human parsing. In addition, benefiting from the superiority of SCHP, we achieved 1st place on all three human parsing tracks in the 3rd Look Into Person Challenge. The code is available at https://github.com/PeikeLi/Self-Correction-Human-Parsing.
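The abstract does not give the aggregation rule, so the following is only a minimal sketch of one plausible reading: after each training cycle, the new result is folded into a running average of the earlier ones. The same averaging could be applied to model weights and to per-pixel label distributions. The function names (`running_average`, `correct_labels`) and the choice of an equal-weight running average are assumptions made for illustration, not details taken from the record or from the authors' code.

```python
import numpy as np


def running_average(previous, current, cycle):
    """Fold the result of cycle number `cycle` (1-based) into the average
    of everything seen before it. Assumed rule, not taken from the source."""
    return (cycle * previous + current) / (cycle + 1)


def correct_labels(noisy_labels, cycle_predictions, num_classes):
    """Return a pseudo mask after aggregating per-cycle predictions.

    noisy_labels: integer array of shape (num_pixels,), the original labels.
    cycle_predictions: list of arrays of shape (num_pixels, num_classes),
        one class-probability map per self-correction cycle (hypothetical
        model outputs).
    """
    # Start from the one-hot encoding of the (possibly wrong) annotations.
    soft_labels = np.eye(num_classes)[np.asarray(noisy_labels)]
    for cycle, probs in enumerate(cycle_predictions, start=1):
        soft_labels = running_average(soft_labels, np.asarray(probs), cycle)
    # The pseudo mask is the most likely class per pixel.
    return soft_labels.argmax(axis=1)


# Toy data: pixel 2 is annotated as class 1 but the model keeps predicting class 2.
noisy = np.array([0, 1, 1, 2])
predictions = [
    np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7], [0.1, 0.1, 0.8]]),
    np.array([[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.1, 0.85], [0.05, 0.05, 0.9]]),
]
pseudo_mask = correct_labels(noisy, predictions, num_classes=3)
print(pseudo_mask)
```

In this toy run the disputed pixel keeps its annotated class after one cycle and switches to the model's class only after the second, which illustrates the gradual, progressive correction the abstract describes. A real pipeline would retrain the model on the updated labels between cycles; that step is omitted here.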
Pages: 3260-3271
Page count: 12