A Self-supervised Framework for Human Instance Segmentation

Cited by: 1
Authors
Jiang, Yalong [1]
Ding, Wenrui [1]
Li, Hongguang [1]
Yang, Hua [2]
Wang, Xu [2]
Affiliations
[1] Beihang Univ, Unmanned Syst Res Inst, Beijing, Peoples R China
[2] HeyIntelligence Technol, Beijing, Peoples R China
Source
COMPUTER VISION - ECCV 2020 WORKSHOPS, PT II | 2020 / Vol. 12536
Keywords
Human instance segmentation; Prior knowledge; Self-supervised;
DOI
10.1007/978-3-030-66096-3_33
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing approaches for human-centered tasks such as human instance segmentation are focused on improving the architectures of models, leveraging weak supervision or transforming supervision among related tasks. Nonetheless, the structures are highly specific and the weak supervision is limited by available priors or number of related tasks. In this paper, we present a novel self-supervised framework for human instance segmentation. The framework includes one module which iteratively conducts mutual refinement between segmentation and optical flow estimation, and the other module which iteratively refines pose estimations by exploring the prior knowledge about the consistency in human graph structures from consecutive frames. The results of the proposed framework are employed for fine-tuning segmentation networks in a feedback fashion. Experimental results on the OCHuman and COCOPersons datasets demonstrate that the self-supervised framework achieves current state-of-the-art performance against existing models on the challenging datasets without requiring additional labels. Unlabeled video data is utilized together with prior knowledge to significantly improve performance and reduce the reliance on annotations. Code released at: https://github.com/AllenYLJiang/SSINS.
Pages: 479-495
Page count: 17
Related papers
37 in total
[1]   PoseTrack: A Benchmark for Human Pose Estimation and Tracking [J].
Andriluka, Mykhaylo ;
Iqbal, Umar ;
Insafutdinov, Eldar ;
Pishchulin, Leonid ;
Milan, Anton ;
Gall, Juergen ;
Schiele, Bernt .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :5167-5176
[2]   Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields [J].
Cao, Zhe ;
Simon, Tomas ;
Wei, Shih-En ;
Sheikh, Yaser .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1302-1310
[3]   Chen LC, 2017, arXiv, arXiv:1606.00915, DOI 10.1109/TPAMI.2017.2699184
[4]   Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [J].
Chen, Liang-Chieh ;
Zhu, Yukun ;
Papandreou, George ;
Schroff, Florian ;
Adam, Hartwig .
COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211 :833-851
[5]   Attention to Scale: Scale-aware Semantic Image Segmentation [J].
Chen, Liang-Chieh ;
Yang, Yi ;
Wang, Jiang ;
Xu, Wei ;
Yuille, Alan L. .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3640-3649
[6]   Cascaded Pyramid Network for Multi-Person Pose Estimation [J].
Chen, Yilun ;
Wang, Zhicheng ;
Peng, Yuxiang ;
Zhang, Zhiqiang ;
Yu, Gang ;
Sun, Jian .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :7103-7112
[7]   Chollet F, 2017, arXiv, arXiv:1610.02357, DOI 10.48550/arXiv.1610.02357
[8]   Instance-aware Semantic Segmentation via Multi-task Network Cascades [J].
Dai, Jifeng ;
He, Kaiming ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3150-3158
[9]  
Dai JF, 2015, PROC CVPR IEEE, P3992, DOI 10.1109/CVPR.2015.7299025
[10]   Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer [J].
Fang, Hao-Shu ;
Lu, Guansong ;
Fang, Xiaolin ;
Xie, Jianwen ;
Tai, Yu-Wing ;
Lu, Cewu .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :70-78