Cross-domain image translation with a novel style-guided diversity loss design

Cited by: 2
Authors
Li, Tingting [1 ]
Zhao, Huan [1 ]
Huang, Jing [3 ]
Li, Keqin [1 ,2 ]
Affiliations
[1] Hunan Univ, Sch Informat Sci & Engn, Changsha 410082, Peoples R China
[2] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
[3] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Xiangtan 411100, Peoples R China
Keywords
Cross-domain image-to-image (I2I) translation; Extracted style features; Generative Adversarial Networks (GAN); Multimodal; Multiple-domain; GENERATIVE ADVERSARIAL NETWORKS;
DOI
10.1016/j.knosys.2022.109731
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-domain image-to-image translation has made remarkable progress in recent years. It aims to map an image from its original domain to one or more target domains so that it can appear in diverse styles. Existing methods are mainly based on Generative Adversarial Networks (GANs) and often employ an auxiliary encoder to extract style features from noise vectors or reference images, which the generator then uses to translate new images. However, these approaches are usually only feasible for two-domain translation and exhibit low diversity in multi-domain translation, since the extracted style features merely serve as an additional input to the generator rather than being fully exploited. This paper proposes style-guided image-to-image translation (SG-I2IT) with a novel diversity regularization term, the style-guided diversity loss (SD loss), which makes full use of the extracted style features. In our model, style features not only serve as the generator's input but also penalize the generator through the new SD loss, encouraging the model to better capture image styles. The effectiveness of our method is demonstrated from two perspectives: noise-based and reference-based image translation. Qualitative and quantitative experiments validate the superiority of the proposed method over state-of-the-art methods in terms of image quality and diversity. In addition, a user study demonstrates that the proposed method better captures image styles and translates more realistic images. (c) 2022 Elsevier B.V. All rights reserved.
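The abstract describes a diversity regularizer in which the distance between two generated outputs is tied to the distance between the style codes that produced them. The following is a minimal illustrative sketch of that idea (not the paper's actual SD loss formulation; all function names here are hypothetical), in the spirit of mode-seeking diversity terms: the loss grows when two distinct style codes yield near-identical outputs, so minimizing it pushes the generator to map different styles to visibly different images.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def style_diversity_loss(out1, out2, style1, style2, eps=1e-8):
    """Illustrative style-guided diversity term (assumed form, not the
    paper's exact loss): ratio of style-code distance to output distance.
    It is large when distinct styles collapse to similar outputs, so the
    generator minimizes it to keep output diversity in step with style
    diversity. `eps` guards against division by zero."""
    return l1_distance(style1, style2) / (l1_distance(out1, out2) + eps)

# Toy usage: two style codes at L1 distance 2.0 whose outputs differ by 1.0
loss = style_diversity_loss([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [2.0, 2.0])
```

In practice such a term is weighted and added to the usual adversarial and reconstruction losses; the key design choice highlighted by the abstract is that the style features appear in the loss itself rather than only as generator input.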
Pages: 16
Related papers
50 records
  • [1] Style-Guided Adversarial Teacher for Cross-Domain Object Detection
    Jia, Longfei
    Tian, Xianlong
    Hu, Yuguo
    Jing, Mengmeng
    Zuo, Lin
    Li, Wen
    ELECTRONICS, 2024, 13 (05)
  • [2] Unsupervised style-guided cross-domain adaptation for few-shot stylized face translation
    Lan, Jiaying
    Ye, Fenghua
    Ye, Zhenghua
    Xu, Pingping
    Ling, Wing-Kuen
    Huang, Guoheng
    THE VISUAL COMPUTER, 2023, 39: 6167 - 6181
  • [3] Unsupervised style-guided cross-domain adaptation for few-shot stylized face translation
    Lan, Jiaying
    Ye, Fenghua
    Ye, Zhenghua
    Xu, Pingping
    Ling, Wing-Kuen
    Huang, Guoheng
    VISUAL COMPUTER, 2023, 39 (12): 6167 - 6181
  • [4] OmniStyleGAN for Style-Guided Image-to-Image Translation
    Zhao, Qianyi
    Wang, Mengyin
    Zhang, Qing
    Wang, Fasheng
    Sun, Fuming
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XI, 2025, 15041 : 351 - 365
  • [5] Style-Guided Image-to-Image Translation for Multiple Domains
    Li, Tingting
    Zhao, Huan
    Wang, Song
    Huang, Jing
    MMPT '21: PROCEEDINGS OF THE 2021 WORKSHOP ON MULTI-MODAL PRE-TRAINING FOR MULTIMEDIA UNDERSTANDING, 2021, : 28 - 36
  • [6] Style-Guided and Disentangled Representation for Robust Image-to-Image Translation
    Choi, Jaewoong
    Kim, Daeha
    Song, Byung Cheol
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 463 - 471
  • [7] Unsupervised content and style learning for multimodal cross-domain image translation
    Lin, Zhijie
    Chen, Jingjing
    Ma, Xiaolong
    Li, Chao
    Zhang, Huiming
    Zhao, Lei
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [8] Image-to-image translation for cross-domain disentanglement
    Gonzalez-Garcia, Abel
    van de Weijer, Joost
    Bengio, Yoshua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [9] Discriminative Style Learning for Cross-Domain Image Captioning
    Yuan, Jin
    Zhu, Shuai
    Huang, Shuyin
    Zhang, Hanwang
    Xiao, Yaoqiang
    Li, Zhiyong
    Wang, Meng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 1723 - 1736
  • [10] Enhancing Style-Guided Image-to-Image Translation via Self-Supervised Metric Learning
    Mao, Qi
    Ma, Siwei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8511 - 8526