PoT-GAN: Pose Transform GAN for Person Image Synthesis

Cited by: 11
Authors
Li, Tianjiao [1]
Zhang, Wei [1]
Song, Ran [1]
Li, Zhiheng [1]
Liu, Jun [2]
Li, Xiaolei [1]
Lu, Shijian [3]
Affiliations
[1] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Peoples R China
[2] Singapore Univ Technol & Design, Informat Syst Technol & Design Pillar, Singapore 487372, Singapore
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Funding
National Natural Science Foundation of China
Keywords
Transforms; Image synthesis; Generative adversarial networks; Generators; Training; Feature extraction; Computer architecture; pose transform; generative adversarial network;
DOI
10.1109/TIP.2021.3104183
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pose-based person image synthesis aims to generate a new image of a person in a target pose, conditioned on a source image of that person in a specified pose. The task is challenging because the target pose is arbitrary and often differs significantly from the source pose, which leads to a large appearance discrepancy between the source and target images. This paper presents the Pose Transform Generative Adversarial Network (PoT-GAN) for person image synthesis, in which the generator explicitly learns the transform between the two poses by manipulating the corresponding multi-scale feature maps. By incorporating the learned pose transform information into the multi-scale feature maps of the source image in a GAN architecture, the method reliably transfers the appearance of the person in the source image to the target pose without any hard-coded spatial information describing the change of pose. Both qualitative and quantitative results show that the proposed PoT-GAN achieves state-of-the-art performance on three publicly available datasets for person image synthesis.
Pages: 7677-7688
Number of pages: 12