Style-Guided Inference of Transformer for High-resolution Image Synthesis

Cited by: 0
Authors
Yim, Jonghwa [1 ]
Kim, Minjae [1 ]
Affiliations
[1] NCSOFT, AI Ctr, Vis AI Lab, Seoul, South Korea
Keywords
DOI
10.1109/WACV56688.2023.00179
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Transformers are eminently suitable for auto-regressive image synthesis, which recursively predicts each discrete value from past values to compose a full image. In particular, combined with a vector-quantised latent representation, state-of-the-art auto-regressive transformers produce realistic high-resolution images. However, sampling latent codes from a discrete probability distribution makes the output unpredictable, so many diverse samples must be generated to obtain a desired output. To avoid this repetitive sampling process, in this article we propose taking a desired output, a style image, as an additional condition, without re-training the transformer. To this end, our method converts the style into a probability constraint that re-balances the prior, thereby specifying a target distribution in place of the original prior. Samples generated from the re-balanced prior therefore have styles similar to the reference style. In practice, either a single image or a category of images can serve as the additional condition. In our qualitative assessment, we show that the styles of the majority of outputs are similar to the input style.
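The abstract's core mechanism, re-balancing an auto-regressive prior over vector-quantised codes with a style-derived probability constraint, can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the code-histogram constraint, the strength parameter, and all function names are hypothetical and do not reproduce the paper's exact formulation.

    # Hypothetical sketch: bias the next-code distribution of an
    # auto-regressive transformer over VQ codes toward the code
    # statistics of a reference style image. The histogram-based
    # constraint here is an illustrative assumption, not the
    # paper's exact method.
    import torch
    import torch.nn.functional as F

    def rebalance_logits(logits, style_codes, codebook_size, strength=1.0):
        # Empirical code distribution of the style image, smoothed so
        # codes absent from the style image are not zeroed out entirely.
        hist = torch.bincount(style_codes, minlength=codebook_size).float()
        style_prior = (hist + 1e-3) / (hist + 1e-3).sum()
        # Re-balance the model prior by the style prior in log space.
        return logits + strength * torch.log(style_prior)

    def sample_next_code(logits, style_codes, codebook_size, temperature=1.0):
        # logits: raw transformer output for the next position (codebook_size,)
        rebalanced = rebalance_logits(logits, style_codes, codebook_size)
        probs = F.softmax(rebalanced / temperature, dim=-1)
        # Draw one discrete code index from the re-balanced distribution.
        return torch.multinomial(probs, num_samples=1)

Because only the sampling distribution changes, the transformer's weights stay fixed, consistent with the abstract's claim that no re-training is required.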
Pages: 1745 - 1755
Number of pages: 11
Related Papers
50 records in total
  • [1] OmniStyleGAN for Style-Guided Image-to-Image Translation
    Zhao, Qianyi
    Wang, Mengyin
    Zhang, Qing
    Wang, Fasheng
    Sun, Fuming
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XI, 2025, 15041 : 351 - 365
  • [2] SGUNet: Style-guided UNet for adversely conditioned fundus image super-resolution
    Fan, Zhihao
    Dan, Tingting
    Liu, Baoyi
    Sheng, Xiaoqi
    Yu, Honghua
    Cai, Hongmin
    NEUROCOMPUTING, 2021, 465 : 238 - 247
  • [3] Style-Guided Image-to-Image Translation for Multiple Domains
    Li, Tingting
    Zhao, Huan
    Wang, Song
    Huang, Jing
    MMPT '21: PROCEEDINGS OF THE 2021 WORKSHOP ON MULTI-MODAL PRE-TRAINING FOR MULTIMEDIA UNDERSTANDING, 2021, : 28 - 36
  • [4] Efficient image restoration with style-guided context cluster and interaction
    Qiao, Fengjuan
    Zhu, Yonggui
    Meng, Ming
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (13) : 6973 - 6991
  • [5] Style-Guided and Disentangled Representation for Robust Image-to-Image Translation
    Choi, Jaewoong
    Kim, Daeha
    Song, Byung Cheol
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 463 - 471
  • [6] StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning
    Jing, Peiguang
    Liu, Xianyi
    Wang, Ji
    Wei, Yinwei
    Nie, Liqiang
    Su, Yuting
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 853 - 861
  • [7] SSFlow: Style-guided Neural Spline Flows for Face Image Manipulation
    Liang, Hanbang
    Hou, Xianxu
    Shen, Linlin
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 79 - 87
  • [8] SGDM: An Adaptive Style-Guided Diffusion Model for Personalized Text to Image Generation
    Xu, Yifei
    Xu, Xiaolong
    Gao, Honghao
    Xiao, Fu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 9804 - 9813