FEditNet++: Few-Shot Editing of Latent Semantics in GAN Spaces With Correlated Attribute Disentanglement

Cited by: 1
Authors
Yi, Ran [1 ]
Hu, Teng [1 ]
Xia, Mengfei [2 ]
Tang, Yizhe [1 ]
Liu, Yong-Jin [3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci & Technol, MOE Key Lab Pervas Comp, BNRist, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Semantics; Accuracy; Correlation; Generative adversarial networks; Gold; Task analysis; Generators; Attribute disentanglement; StyleGAN latent space; few-shot attribute editing
DOI
10.1109/TPAMI.2024.3432529
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Generative Adversarial Networks (GANs) have achieved significant advances in generating and editing high-resolution images. However, most existing methods either require extensive labeled datasets or rely on strong prior knowledge, and they struggle to disentangle correlated attributes from few-shot data. In this paper, we propose FEditNet++, a GAN-based approach for exploring latent semantics that enables attribute editing with limited labeled data and disentangles correlated attributes. We propose a layer-wise feature contrastive objective that accounts for content consistency and encourages unrelated attributes to remain invariant before and after editing. Furthermore, we harness knowledge from a pretrained discriminative model to prevent overfitting. In particular, to address the entanglement between correlated attributes arising from both the data and semantic latent correlation, we extend our model to jointly optimize multiple attributes and propose a novel decoupling loss and cross-assessment loss that disentangle them in both the latent and image spaces. We further propose a novel-attribute disentanglement strategy that enables editing of novel attributes with unknown entanglements. Finally, we extend our model to accurately edit fine-grained attributes. Qualitative and quantitative evaluations demonstrate that our method outperforms state-of-the-art approaches on various datasets, including CelebA-HQ, RaFD, Danbooru2018, and LSUN Church.
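To make the layer-wise feature contrastive objective concrete, the sketch below gives a minimal, hypothetical PyTorch implementation in the spirit of patch-based contrastive (PatchNCE-style) losses; it is not the authors' released code. It assumes per-layer feature maps extracted from the generator before and after editing, treats corresponding spatial locations as positive pairs and other sampled locations in the same image as negatives, so that regions unrelated to the edited attribute are encouraged to stay invariant. The function name, sampling scheme, and temperature tau are illustrative assumptions.

    # Minimal sketch (assumed, not the authors' code) of a layer-wise
    # feature contrastive objective for edit-invariance of unrelated content.
    import torch
    import torch.nn.functional as F

    def layerwise_contrastive_loss(feats_orig, feats_edit, num_patches=64, tau=0.07):
        """PatchNCE-style loss averaged over generator layers.

        feats_orig / feats_edit: lists of tensors, each of shape (B, C, H, W),
        taken from corresponding generator layers before and after editing.
        The same spatial location in both maps is a positive pair; other
        sampled locations in the same image act as negatives.
        """
        total = 0.0
        for fo, fe in zip(feats_orig, feats_edit):
            b, c, h, w = fo.shape
            n = min(num_patches, h * w)
            # Sample the same random spatial locations in both feature maps.
            idx = torch.randperm(h * w, device=fo.device)[:n]
            q = fe.flatten(2)[:, :, idx].permute(0, 2, 1)   # (B, n, C) queries
            k = fo.flatten(2)[:, :, idx].permute(0, 2, 1)   # (B, n, C) keys
            q = F.normalize(q, dim=-1)
            k = F.normalize(k, dim=-1)
            # Similarity of every query patch to every key patch in the image.
            logits = torch.bmm(q, k.transpose(1, 2)) / tau  # (B, n, n)
            # The diagonal (same location before/after editing) is the positive.
            labels = torch.arange(n, device=fo.device).expand(b, n).reshape(-1)
            total = total + F.cross_entropy(logits.reshape(b * n, n), labels)
        return total / len(feats_orig)

In practice the per-layer terms could be weighted differently across generator resolutions; the uniform average here is only for simplicity.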
Pages: 9975-9990
Page count: 16