Mask-Guided Deformation Adaptive Network for Human Parsing

Cited by: 1
Authors
Mao, Aihua [1]
Liang, Yuan [1]
Jiao, Jianbo [2]
Liu, Yongtuo [1]
He, Shengfeng [1]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Univ Oxford, Dept Engn Sci, Parks Rd, Oxford OX1 3PJ, England
Funding
National Natural Science Foundation of China
Keywords
Human parsing; multi-task learning; deformable convolution; PERSON REIDENTIFICATION; SINGLE; MODELS;
DOI
10.1145/3467889
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Due to the challenges of densely compacted body parts, nonrigid clothing items, and severe overlap in crowd scenes, human parsing needs to focus more on multilevel feature representations than general scene parsing tasks do. Based on this observation, we propose to introduce the auxiliary tasks of human mask and edge detection to facilitate human parsing. Unlike human parsing, which exploits the discriminative features of each category, human mask and edge detection emphasize the boundaries of semantic parsing regions and the difference between foreground humans and background clutter, which benefits the parsing predictions of crowd scenes and small human parts. Specifically, we extract human mask and edge labels from the human parsing annotations and train a shared encoder with three independent decoders for the three mutually beneficial tasks. Furthermore, the decoder feature maps of the human mask prediction branch are exploited as attention maps that indicate human regions, facilitating the decoding process of human parsing and human edge detection. In addition to these auxiliary tasks, we alleviate the problem of deformed clothing items under various human poses by tracking the deformation patterns with deformable convolution. Extensive experiments show that the proposed method achieves superior performance against state-of-the-art methods on both single and multiple human parsing datasets. Code and trained models are available at https://github.com/ViktorLiang/MGDAN.
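
The label-extraction step mentioned in the abstract (deriving human mask and edge labels from parsing annotations) could look roughly like the sketch below. It is an illustration only, not the authors' code: the function name `mask_and_edge_labels`, the use of label 0 for background, and the rule that a pixel is an edge when its right or lower neighbour has a different label are all assumptions made here.

```python
from typing import List, Tuple

Grid = List[List[int]]


def mask_and_edge_labels(parsing: Grid, background: int = 0) -> Tuple[Grid, Grid]:
    """Derive binary human-mask and edge maps from a parsing label map.

    Assumptions (not taken from the paper): `background` marks non-human
    pixels, and an edge pixel is any pixel whose right or lower neighbour
    carries a different label.
    """
    height, width = len(parsing), len(parsing[0])
    # Human mask: 1 wherever the parsing label is not background.
    mask = [[int(parsing[y][x] != background) for x in range(width)]
            for y in range(height)]
    # Edge map: 1 wherever the label changes to the right or below.
    edge = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            label = parsing[y][x]
            right_differs = x + 1 < width and parsing[y][x + 1] != label
            below_differs = y + 1 < height and parsing[y + 1][x] != label
            if right_differs or below_differs:
                edge[y][x] = 1
    return mask, edge


# Toy 4x4 annotation: 0 = background, 1 and 2 = two hypothetical part labels.
parsing = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
mask, edge = mask_and_edge_labels(parsing)
```

In this toy case the mask covers the two middle rows of the person region, while the edge map also fires on the boundary between part labels 1 and 2, which is the kind of part-boundary signal the abstract attributes to the edge task.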
Pages: 20