Controllable face image editing in a disentanglement way

Cited by: 0
Authors
Zhou, Shiyan [1]
Wang, Ke [1]
Zhang, Jun [2]
Xia, Yi [1]
Chen, Peng [3]
Wang, Bing [4]
Affiliations
[1] Anhui Univ, Sch Elect Engn & Automat, Hefei, Peoples R China
[2] Anhui Univ, Sch Artificial Intelligence, Hefei, Peoples R China
[3] Anhui Univ, Sch Internet, Hefei, Peoples R China
[4] Anhui Univ Technol, Sch Elect & Informat Engn, Maanshan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
disentanglement; face editing; image processing; generative adversarial networks;
DOI
10.1117/1.JEI.32.4.043011
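Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code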
0808 ; 0809 ;
Abstract
The prevalence of deep learning has attracted interest in face image manipulation, especially face editing with disentangled representations. However, realizing a controllable disentangled representation of face images remains challenging. Current methods either require extensive supervision and training or produce images of significantly impaired quality. We present an approach that learns to represent data in a disentangled way with minimal supervision. Specifically, we use a swapping autoencoder with identity and attribute branches to learn identity and attribute representations, respectively. In addition, we separate disentanglement from synthesis by using a pre-trained unsupervised StyleGAN2 image generator, so that the rest of the network focuses on learning data disentanglement. The identity and attribute vectors from different images are combined into a new representation, which a linear mapper maps into the generator's latent space to generate a new hybrid image. In this way, we take advantage of StyleGAN2's image quality and its expressive latent space without the burden of training a decoder. Experimental results show that our method successfully separates identity from the other attributes of face images, outperforms existing methods, and requires less training and supervision. © 2023 SPIE and IS&T
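The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two encoders and the StyleGAN2 generator are omitted, the vectors are toy values, the dimensions are arbitrary, and combining by concatenation is an assumption (the abstract only says the vectors are "combined"). The names `make_mapper`, `linear_map`, and `hybrid_latent` are hypothetical.

```python
import random


def linear_map(weights, bias, vector):
    """Apply an affine map y = W x + b using plain Python lists."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def make_mapper(in_dim, out_dim, seed=0):
    """Build a randomly initialised linear mapper (untrained stand-in)."""
    rng = random.Random(seed)
    weights = [
        [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]
    bias = [0.0] * out_dim
    return weights, bias


def hybrid_latent(identity_vec, attribute_vec, mapper):
    """Combine identity and attribute vectors and map them to a latent code.

    Concatenation is an assumed way of combining the two vectors.
    """
    weights, bias = mapper
    combined = identity_vec + attribute_vec  # list concatenation
    return linear_map(weights, bias, combined)


# Toy dimensions, chosen only for illustration.
ID_DIM, ATTR_DIM, LATENT_DIM = 4, 3, 5
mapper = make_mapper(ID_DIM + ATTR_DIM, LATENT_DIM)

identity_a = [1.0, 0.0, 0.5, -0.5]   # stands in for the identity code of image A
attribute_b = [0.2, -0.3, 0.9]       # stands in for the attribute code of image B

w = hybrid_latent(identity_a, attribute_b, mapper)
print(len(w))
```

In the pipeline the abstract describes, a latent code like `w` would be passed to the frozen pre-trained generator, so only the encoders and the mapper would need training.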
Pages: 14