Enhanced Context Learning with Transformer for Human Parsing

Cited by: 1

Authors:

Song, Jingya [1,2,3]

Shi, Qingxuan [1,2,3]

Li, Yihang [1,2,3]

Yang, Fang [1,2,3]

Affiliations:

[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China

[2] Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China

[3] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071002, Peoples R China

Source:

APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 15

Keywords:

human parsing; semantic segmentation; deep learning; SEGMENTATION;

DOI:

10.3390/app12157821

Chinese Library Classification number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Human parsing is a fine-grained human semantic segmentation task in the field of computer vision. Because of occlusion, diverse poses, and the similar appearance of different body parts and clothing, human parsing requires more attention to learning context information. Based on this observation, we enhance the learning of global and local information to obtain more accurate human parsing results. In this paper, we introduce a Global Transformer Module (GTM) that uses a self-attention mechanism to capture long-range dependencies and extract context information effectively. Moreover, we design a Detailed Feature Enhancement (DFE) architecture to exploit spatial semantics for small targets. The low-level visual features from CNN intermediate layers are enhanced by channel and spatial attention. In addition, we adopt an edge detection module to refine the prediction. We conducted extensive experiments on three datasets (i.e., LIP, ATR, and Fashion Clothing) to show the effectiveness of our method, which achieves 54.55% mIoU on the LIP dataset, an average F-1 score of 80.26% on the ATR dataset, and an average F-1 score of 55.19% on the Fashion Clothing dataset.
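The abstract reports results in mIoU (mean intersection over union). As a rough illustration of what that metric measures, the sketch below computes it for flat lists of per-pixel class labels. The function name `mean_iou` and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions made for this example; this is not the authors' evaluation code, and benchmark protocols may average differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in the prediction or the ground truth.

    pred, target: equal-length sequences of integer class labels (one per pixel).
    num_classes: total number of classes; labels must lie in [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12.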

Pages: 16
