FCGNet: Foreground and Class Guided Network for human parsing

Citations: 0
Authors
Jang, Jaehyuk [1]
Wang, Yooseung [1]
Kim, Changick [1]
Affiliation
[1] Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering, Daejeon 34141, South Korea
Keywords
Human parsing; Semantic segmentation; Graph convolutional network
DOI
10.1016/j.patcog.2024.110879
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Understanding the inherent hierarchical human structure is key to human parsing. To capture human-specific characteristics, it is necessary to focus on the spatial and class information corresponding to the foreground (i.e., the human) in an image. Inspired by these insights, we introduce two supervision signals: spatial foreground information and information about which classes exist in the image. Using foreground information as guidance, the network generates a human-focused feature map and captures pixel-wise hierarchical characteristics by computing correlations between pixels. Furthermore, we guide the network to consider the class information in the image at the feature level and to capture class-wise relationships by calculating correlations between channels. Moreover, during the training phase, we prevent the network from misclassifying pixels into confusing classes by providing the existing-class information to the network at the prediction level. Our model achieves state-of-the-art performance on three public benchmarks with significantly fewer parameters and Multiply-Accumulate Operations (MACs).
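The three mechanisms named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the correlations are computed or how the guidance is applied, so the plain dot-product correlation, the hard foreground mask, and the logit masking below are all assumptions, and every function name is hypothetical.

```python
import math

def gram(rows):
    """Return the matrix of dot products between every pair of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

def pixel_correlation(features, foreground):
    """features: C channels, each a list of N pixel values.
    foreground: N flags in {0, 1} (assumed hard mask).
    Background pixels are zeroed first, so only foreground pixels correlate."""
    masked = [[v * m for v, m in zip(ch, foreground)] for ch in features]
    pixels = [list(col) for col in zip(*masked)]  # N rows of length C
    return gram(pixels)                           # N x N matrix

def channel_correlation(features):
    """Correlate channels with each other (assumed dot product): C x C matrix."""
    return gram(features)

def mask_absent_classes(logits, present):
    """Suppress the scores of classes flagged as absent from the image.
    Assumes at least one class is present, otherwise softmax is undefined."""
    return [x if p else -math.inf for x, p in zip(logits, present)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy input: C = 2 channels, N = 3 pixels, middle pixel is background.
features = [[1.0, 2.0, 3.0], [0.0, 1.0, 1.0]]
foreground = [1, 0, 1]
pixel_corr = pixel_correlation(features, foreground)
channel_corr = channel_correlation(features)

# One pixel with scores for 3 classes; the third class is absent from the image.
probs = softmax(mask_absent_classes([2.0, 1.0, 3.0], [1, 1, 0]))
```

In this sketch the background pixel contributes an all-zero row and column to `pixel_corr`, and the absent class receives zero probability even though its raw score was the highest, which is the kind of confusion the prediction-level guidance is described as preventing.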
Pages: 12