Enhanced Context Learning with Transformer for Human Parsing

Cited by: 1

Authors:

Song, Jingya [1,2,3]

Shi, Qingxuan [1,2,3]

Li, Yihang [1,2,3]

Yang, Fang [1,2,3]

Affiliations:

[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China

[2] Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China

[3] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071002, Peoples R China

Source:

APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 15

Keywords:

human parsing; semantic segmentation; deep learning; SEGMENTATION;

DOI:

10.3390/app12157821

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Human parsing is a fine-grained human semantic segmentation task in computer vision. Because of occlusion, diverse poses, and the similar appearance of different body parts and clothing, human parsing requires more attention to learning context information. Based on this observation, we enhance the learning of global and local information to obtain more accurate human parsing results. In this paper, we introduce a Global Transformer Module (GTM) that uses a self-attention mechanism to capture long-range dependencies and extract context information effectively. Moreover, we design a Detailed Feature Enhancement (DFE) architecture to exploit spatial semantics for small targets. The low-level visual features from CNN intermediate layers are enhanced using channel and spatial attention. In addition, we adopt an edge detection module to refine the prediction. We conducted extensive experiments on three datasets (LIP, ATR, and Fashion Clothing) to show the effectiveness of our method, which achieves 54.55% mIoU on the LIP dataset, an average F-1 score of 80.26% on the ATR dataset, and an average F-1 score of 55.19% on the Fashion Clothing dataset.
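
For readers unfamiliar with the mIoU figure quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union is commonly computed for a label map. It is not taken from the paper: the function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and the evaluation protocol used on LIP may differ (for example, by accumulating counts over the whole dataset).

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are equal-length flat sequences of integer class labels.
    (Hypothetical helper for illustration; not the paper's evaluation code.)
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 assumed labels (0 = background, 1 = hair, 2 = upper clothes).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.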


Pages: 16

