Image-to-image translation using an offset-based multi-scale codes GAN encoder

Cited by: 8

Authors:

Guo, Zihao [1]

Shao, Mingwen [1]

Li, Shunhang [1]

Affiliations:

[1] China Univ Petr, Coll Comp Sci & Technol, Qingdao, Peoples R China

Source:

VISUAL COMPUTER | 2024 / Vol. 40 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

Generative adversarial networks; GAN inversion; Image-to-image translation; Super-resolution; Conditional face synthesis;

DOI:

10.1007/s00371-023-02810-4

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Despite the remarkable achievements of generative adversarial networks (GANs) in high-quality image synthesis, applying pre-trained GAN models to image-to-image translation is still challenging. Previous approaches typically map the conditional image into the latent spaces of GANs by per-image optimization or learning a GAN encoder. However, neither of these two methods can ideally perform image-to-image translation tasks. In this work, we propose a novel learning-based framework which can complete common image-to-image translation tasks with high quality in real-time based on pre-trained GANs. Specifically, to mitigate the semantic misalignment between conditional and synthesized images, we propose an offset-based image synthesis method that allows our encoder to use multiple rather than one forward propagation to predict the latent codes. During the multiple forward passes, the final latent codes are adjusted continuously according to the semantic difference between the conditional image and the current synthesized image. To further reduce the loss of details during encoding, we extract multiple latent codes at multiple scales from input instead of a single code to synthesize the image. Moreover, we propose an optional multiple feature maps fusion module that combines our encoder with different generators to implement our multiple latent codes strategies. Finally, we analyze the performance and demonstrate the effectiveness of our framework by comparing it with state-of-the-art works on super-resolution and conditional face synthesis tasks.
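The offset-based idea in the abstract (several forward passes, each one adjusting the latent code according to the difference between the conditional image and the current synthesized image) can be illustrated with a toy sketch. This is not the authors' implementation: the linear "generator", the pseudo-inverse "encoder", the step size, and every function name below are hypothetical stand-ins chosen so that the loop is runnable with NumPy alone. The multi-scale codes and the feature-map fusion module are not modeled.

```python
import numpy as np


def make_toy_generator(latent_dim, image_dim, seed=0):
    """Hypothetical stand-in for a pre-trained GAN generator: a fixed linear map."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((image_dim, latent_dim))

    def generator(w):
        return weights @ w

    return generator, weights


def make_toy_offset_encoder(weights, step=0.5):
    """Hypothetical stand-in for a learned encoder.

    It predicts a latent offset from the difference between the conditional
    image and the current synthesized image. A real encoder would be a
    trained network; here a scaled pseudo-inverse plays that role.
    """
    pinv = np.linalg.pinv(weights)

    def encoder(conditional, synthesized):
        return step * (pinv @ (conditional - synthesized))

    return encoder


def offset_based_inversion(conditional, generator, encoder, w_init, num_passes=5):
    """Refine the latent code over several forward passes.

    Returns the final code and the reconstruction error measured
    before each pass.
    """
    w = w_init.copy()
    errors = []
    for _ in range(num_passes):
        synthesized = generator(w)
        errors.append(float(np.linalg.norm(conditional - synthesized)))
        # Adjust the code by the predicted offset instead of re-predicting it.
        w = w + encoder(conditional, synthesized)
    return w, errors


if __name__ == "__main__":
    generator, weights = make_toy_generator(latent_dim=4, image_dim=16)
    encoder = make_toy_offset_encoder(weights)
    target = generator(np.array([1.0, -2.0, 0.5, 3.0]))
    _, errors = offset_based_inversion(target, generator, encoder, np.zeros(4))
    print(errors)
```

In this toy setting each pass removes a fixed fraction of the remaining error, which mirrors the qualitative behavior described in the abstract; the behavior of a trained encoder on real images would of course depend on the model and data.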

Citation

Pages: 699-715

Number of pages: 17

50 records in total

[1] Image-to-image translation using an offset-based multi-scale codes GAN encoder
Guo, Zihao
Shao, Mingwen
Li, Shunhang
VISUAL COMPUTER, 2023,
[2] Image-to-image translation using an offset-based multi-scale codes GAN encoder
Zihao Guo
Mingwen Shao
Shunhang Li
The Visual Computer, 2024, 40 (2) : 699 - 715
[3] Asymmetric GAN for Unpaired Image-to-Image Translation
Li, Yu
Tang, Sheng
Zhang, Rui
Zhang, Yongdong
Li, Jintao
Yan, Shuicheng
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (12) : 5881 - 5896
[4] SUNIT: multimodal unsupervised image-to-image translation with shared encoder
Lin, Liyuan
Ji, Shulin
Zhou, Yuan
Zhang, Shun
JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (01)
[5] SPA-GAN: Spatial Attention GAN for Image-to-Image Translation
Emami, Hajar
Aliabadi, Majid Moradi
Dong, Ming
Chinnam, Ratna Babu
IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 391 - 401
[6] Consistent Embedded GAN for Image-to-Image Translation
Xiong, Feng
Wang, Qianqian
Gao, Quanxue
IEEE ACCESS, 2019, 7 : 126651 - 126661
[7] Multimodal Unsupervised Image-to-Image Translation Without Independent Style Encoder
Sun, Yanbei
Lu, Yao
Lu, Haowei
Zhao, Qingjie
Wang, Shunzhou
MULTIMEDIA MODELING (MMM 2022), PT I, 2022, 13141 : 624 - 636
[8] Image-To-Image Translation Using a Cross-Domain Auto-Encoder and Decoder
Yoo, Jaechang
Eom, Heesong
Choi, Yong Suk
APPLIED SCIENCES-BASEL, 2019, 9 (22):
[9] Hypercomplex Image-to-Image Translation
Grassucci, Eleonora
Sigillo, Luigi
Uncini, Aurelio
Comminiello, Danilo
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[10] InvolutionGAN: lightweight GAN with involution for unsupervised image-to-image translation
Deng, Haipeng
Wu, Qiuxia
Huang, Han
Yang, Xiaowei
Wang, Zhiyong
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (22) : 16593 - 16605
