Unabridged adjacent modulation for clothing parsing

Cited by: 10
Authors
Zhang, Dong [1]
Zuo, Chengting [1]
Wu, Qianhao [1]
Fu, Liyong [2, 3]
Xiang, Xinguang [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Chinese Acad Forestry, Res Inst Forest Resource Informat Tech, Beijing 100091, Peoples R China
[3] Natl Forestry & Grassland Adm, Key Lab Forest Management & Growth Modeling, Beijing 100091, Peoples R China
Keywords
Encoder-decoder network; Clothing parsing; Attention learning; Features modulation; Self-supervised learning; Joint image segmentation; Network
DOI
10.1016/j.patcog.2022.108594
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clothing parsing has recently made substantial progress in computer vision. Most state-of-the-art methods are based on the encoder-decoder architecture. However, existing methods largely neglect two problems: uncalibrated features within blocks and semantics dilution between blocks. In this work, we propose an unabridged adjacent modulation network (UAM-Net) that aggregates multi-level features for clothing parsing. We first build an unabridged channel attention (UCA) mechanism on the feature maps within each block for feature recalibration. We then design a top-down adjacent modulation (TAM) for the decoder blocks. With TAM, high-level semantic information and visual contexts are gradually transferred into lower-level layers without loss. Used together, UCA and TAM ensure that the encoder has an enhanced feature representation ability and that the low-level features of the decoders contain abundant semantic contexts. Quantitative and qualitative experimental results on two challenging benchmarks (i.e., colorful fashion parsing and the modified fashion clothing) show that UAM-Net achieves accuracy competitive with state-of-the-art methods. The source code is available at: https://github.com/ctzuo/UAM-Net. (c) 2022 Elsevier Ltd. All rights reserved.
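The abstract does not spell out how UCA or TAM are computed, so the following is only a generic illustration of the two ideas it names: per-channel feature recalibration and top-down fusion of a higher-level map into a lower-level one. It uses a squeeze-and-excitation-style gate and a nearest-neighbour upsample-and-add step as stand-ins. All function names, the per-channel scalar weights, and the additive fusion are assumptions for illustration, not the authors' implementation (see the linked repository for that).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat, weights, biases):
    """Rescale each channel of feat ([C][H][W] nested lists) by a gate in (0, 1).

    Assumption: one hypothetical scalar weight and bias per channel,
    applied to the channel's global average (a simplified SE-style gate).
    """
    out = []
    for c, channel in enumerate(feat):
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count  # global average pooling
        gate = sigmoid(weights[c] * mean + biases[c])
        out.append([[v * gate for v in row] for row in channel])
    return out

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a [C][H][W] feature map."""
    out = []
    for channel in feat:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out

def top_down_modulate(low, high):
    """Add the upsampled high-level map to the low-level map (assumed fusion rule)."""
    up = upsample2x(high)
    return [
        [[l + h for l, h in zip(low_row, up_row)]
         for low_row, up_row in zip(low_ch, up_ch)]
        for low_ch, up_ch in zip(low, up)
    ]

# Toy example: one channel, a 2x2 low-level map and a 1x1 high-level map.
low = [[[0.0, 1.0], [2.0, 3.0]]]
high = [[[1.0]]]
recalibrated = channel_attention(low, weights=[0.0], biases=[0.0])  # gate = 0.5
fused = top_down_modulate(recalibrated, high)
print(fused)
```

With zero weight and bias the gate is exactly 0.5, so the low-level map is halved before the upsampled high-level value is added to every position.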
Pages: 10