Grammar-Induced Wavelet Network for Human Parsing

Cited by: 6
Authors
Zhang, Xiaomei [1, 2]
Chen, Yingying [1, 2]
Tang, Ming [1, 2]
Lei, Zhen [3, 4, 5]
Wang, Jinqiao [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Chinese Acad Sci CASIA, Inst Automat, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China
[4] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[5] Chinese Acad Sci, Hong Kong Inst Sci & Innovat, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
Keywords
Grammar; Feature extraction; Wavelet transforms; Task analysis; Predictive models; Image edge detection; Image segmentation; Human parsing; grammar-induced wavelet network; blended grammar-induced module; wavelet prediction module; NEURAL-NETWORK; MODELS; SEGMENTATION; PARTS; POSE;
DOI
10.1109/TIP.2022.3181486
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most existing human parsing methods still struggle to extract an accurate foreground from similar or cluttered scenes. In this paper, we propose a Grammar-induced Wavelet Network (GWNet) to address this challenge. GWNet consists mainly of two modules: a blended grammar-induced module and a wavelet prediction module. We design the blended grammar-induced module to exploit the relationships among different human parts and the inherent hierarchical structure of the human body by means of grammar rules applied in both cascaded and parallel manners. In this way, conspicuous parts, which are easily distinguished from the background, can amend the segmentation of inconspicuous ones, improving foreground extraction. We also design a Part-aware Convolutional Recurrent Neural Network (PCRNN) to pass the messages generated by the grammar rules. To further improve performance, we propose a wavelet prediction module that captures the basic structure and the edge details of a person by decomposing features into low-frequency and high-frequency components. The low-frequency component represents smooth structures, and the high-frequency components describe fine details. We conduct extensive experiments to evaluate GWNet on the PASCAL-Person-Part, LIP, and PPSS datasets. GWNet obtains state-of-the-art performance on these human parsing datasets.
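The split into low-frequency and high-frequency components mentioned in the abstract can be illustrated with a single-level 2D Haar transform. The sketch below is only an illustration under stated assumptions: the abstract does not say which wavelet GWNet uses or how the wavelet prediction module is built, and the function name `haar_dwt2` and its sub-band names are hypothetical.

```python
def haar_dwt2(feature):
    """Single-level 2D Haar transform of an even-sized 2D list of numbers.

    Assumption: the Haar wavelet is chosen only for simplicity; the
    abstract does not name the wavelet used by GWNet.
    Returns (low, (row_detail, col_detail, diag_detail)).
    """
    rows, cols = len(feature), len(feature[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    low, row_detail, col_detail, diag_detail = [], [], [], []
    for i in range(0, rows, 2):
        l_row, r_row, c_row, d_row = [], [], [], []
        for j in range(0, cols, 2):
            # Each non-overlapping 2x2 block [[a, b], [c, d]] yields one
            # coefficient in every sub-band.
            a, b = feature[i][j], feature[i][j + 1]
            c, d = feature[i + 1][j], feature[i + 1][j + 1]
            l_row.append((a + b + c + d) / 2)  # scaled block average: smooth structure
            r_row.append((a + b - c - d) / 2)  # top row minus bottom row
            c_row.append((a - b + c - d) / 2)  # left column minus right column
            d_row.append((a - b - c + d) / 2)  # diagonal difference
        low.append(l_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return low, (row_detail, col_detail, diag_detail)


# A 2x2 patch containing a vertical edge (dark left column, bright right column).
edge_patch = [[0.0, 1.0],
              [0.0, 1.0]]
low, (row_detail, col_detail, diag_detail) = haar_dwt2(edge_patch)
```

On a constant patch all three detail sub-bands are zero and only the low-frequency band is non-zero, while the edge patch above produces a non-zero response in `col_detail`. This matches the abstract's description that the low-frequency component carries smooth structure and the high-frequency components carry edge details.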
Pages: 4502-4514
Number of pages: 13