Enhanced Context Learning with Transformer for Human Parsing

Cited by: 1
Authors
Song, Jingya [1, 2, 3]
Shi, Qingxuan [1, 2, 3]
Li, Yihang [1, 2, 3]
Yang, Fang [1, 2, 3]
Affiliations
[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
[2] Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China
[3] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071002, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 15
Keywords
human parsing; semantic segmentation; deep learning; SEGMENTATION;
DOI
10.3390/app12157821
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Human parsing is a fine-grained human semantic segmentation task in computer vision. Because of occlusion, diverse poses, and the similar appearance of different body parts and clothing, human parsing depends heavily on learning context information. Based on this observation, we enhance the learning of global and local information to obtain more accurate human parsing results. In this paper, we introduce a Global Transformer Module (GTM) that uses a self-attention mechanism to capture long-range dependencies and extract context information effectively. Moreover, we design a Detailed Feature Enhancement (DFE) architecture to exploit spatial semantics for small targets. The low-level visual features from intermediate CNN layers are enhanced with channel and spatial attention. In addition, we adopt an edge detection module to refine the prediction. We conducted extensive experiments on three datasets (LIP, ATR, and Fashion Clothing) to show the effectiveness of our method, which achieves 54.55% mIoU on the LIP dataset, an average F-1 score of 80.26% on the ATR dataset, and an average F-1 score of 55.19% on the Fashion Clothing dataset.
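For readers unfamiliar with the mIoU figure quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union can be computed for a single pair of label maps. It is not the authors' evaluation code: the function name `mean_iou`, the toy label maps, and the class names in the comments are assumptions made for illustration. Benchmark evaluations usually accumulate counts over a whole dataset before averaging, which this sketch does not do.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in the prediction or the ground truth.

    Hypothetical helper for illustration; inputs are flattened label maps.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Count per-class pixels in each map and the pixels where both agree.
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps, flattened row by row.
# Assumed labels: 0 = background, 1 = hair, 2 = coat.
pred = [0, 1, 1, 0, 2, 2]
target = [0, 1, 2, 0, 2, 2]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoU values are 1, 1/2, and 2/3, so one mislabeled pixel out of six already lowers the mean noticeably. This is why fine-grained classes with few pixels weigh heavily on the metric.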
Pages: 16
Related papers
50 in total
  • [21] FEANet: Foreground-edge-aware network with DenseASPOC for human parsing
    Yu, Wing-Yin
    Po, Lai-Man
    Zhao, Yuzhi
    Zhang, Yujia
    Lau, Kin-Wai
    IMAGE AND VISION COMPUTING, 2021, 109
  • [22] Human parsing by weak structural label
    Zhiyong Chen
    Si Liu
    Yanlong Zhai
    Jia Lin
    Xiaochun Cao
    Liang Yang
    Multimedia Tools and Applications, 2018, 77 : 19795 - 19809
  • [23] Enhanced Facade Parsing for Street-Level Images Using Convolutional Neural Networks
    Kong, Gefei
    Fan, Hongchao
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (12) : 10519 - 10531
  • [24] Parsing Address Texts with Deep Learning Method
    Delil, Selman
    Kuyumcu, Birol
    Aksakalli, Cuneyt
    Akcira, Isa Semih
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
  • [25] Parsing Facades with Shape Grammars and Reinforcement Learning
    Teboul, Olivier
    Kokkinos, Iasonas
    Simon, Loic
    Koutsourakis, Panagiotis
    Paragios, Nikos
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (07) : 1744 - 1756
  • [26] GCAENet: global-class context with advanced edge network for single human parsing
    Zhang, Xiukun
    Liu, Weibin
    Xing, Weiwei
    Wei, Xiang
    VISUAL COMPUTER, 2023, 39 (12) : 6379 - 6394
  • [27] GCAENet: global-class context with advanced edge network for single human parsing
    Xiukun Zhang
    Weibin Liu
    Weiwei Xing
    Xiang Wei
    The Visual Computer, 2023, 39 : 6379 - 6394
  • [28] Human Parsing with Joint Learning for Dynamic mmWave Radar Point Cloud
    Wang, Shuai
    Cao, Dongjiang
    Liu, Ruofeng
    Jiang, Wenchao
    Yao, Tianshun
    Lu, Chris Xiaoxuan
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2023, 7 (01):
  • [29] Prior-Structure Driven Weakly-Supervised Learning for Fine-Grained Human Parsing
    Hao, Huaqing
    Liu, Weibin
    Xing, Weiwei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 461 - 476
  • [30] Multilabel learning based adaptive graph convolutional network for human parsing
    Hao, Huaqing
    Liu, Weibin
    Xing, Weiwei
    Zhang, Shunli
    PATTERN RECOGNITION, 2022, 127