Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer

Cited by: 14
Authors
Chung, Jiwoo [1 ]
Hyun, Sangeek [1 ]
Heo, Jae-Pil [1 ]
Institutions
[1] Sungkyunkwan Univ, Seoul, South Korea
Keywords
DOI
10.1109/CVPR52733.2024.00840
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite the impressive generative capabilities of diffusion models, existing diffusion model-based style transfer methods either require inference-stage optimization (e.g., fine-tuning or textual inversion of style), which is time-consuming, or fail to leverage the generative ability of large-scale diffusion models. To address these issues, we introduce a novel artistic style transfer method based on a pre-trained large-scale diffusion model, without any optimization. Specifically, we manipulate the features of the self-attention layers in the way the cross-attention mechanism works: during the generation process, we substitute the key and value of the content with those of the style image. This approach provides several desirable characteristics for style transfer, including 1) preservation of content by transferring similar styles to similar image patches and 2) transfer of style based on the similarity of local texture (e.g., edges) between the content and style images. Furthermore, we introduce query preservation and attention temperature scaling to mitigate disruption of the original content, and initial latent Adaptive Instance Normalization (AdaIN) to deal with disharmonious colors (failure to transfer the colors of the style). Our experimental results demonstrate that the proposed method surpasses state-of-the-art methods across both conventional and diffusion-based style transfer baselines. Code is available at github.com/jiwoogit/StyleID.
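The abstract describes three mechanisms: swapping the content's self-attention key/value with the style image's, query preservation with temperature scaling, and AdaIN on the initial latent. The following is a minimal NumPy sketch of those ideas, not the authors' implementation; the parameter names `gamma` (query-preservation blend) and `tau` (attention temperature) are illustrative assumptions, and the arrays stand in for features that would come from a diffusion U-Net.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def style_injected_attention(q_cs, q_c, k_s, v_s, gamma=0.75, tau=1.5):
    """Self-attention step with the style image's key/value swapped in.

    q_cs: query of the stylized generation path, shape (n, d)
    q_c:  query from the content image, shape (n, d)
    k_s, v_s: key/value from the style image, shape (m, d)
    gamma: query preservation -- blending toward the content query
           keeps the original spatial structure (assumed name).
    tau:   temperature > 1 sharpens the attention map (assumed name).
    """
    q = gamma * q_c + (1.0 - gamma) * q_cs
    d = q.shape[-1]
    attn = softmax(tau * (q @ k_s.T) / np.sqrt(d), axis=-1)
    return attn @ v_s  # each content patch aggregates style values

def adain(x_c, x_s, eps=1e-5):
    """Initial latent AdaIN: match channel-wise mean/std of the content
    latent to the style latent, transferring color statistics.
    x_c, x_s: latents of shape (channels, h, w)."""
    mu_c = x_c.mean(axis=(-2, -1), keepdims=True)
    sd_c = x_c.std(axis=(-2, -1), keepdims=True)
    mu_s = x_s.mean(axis=(-2, -1), keepdims=True)
    sd_s = x_s.std(axis=(-2, -1), keepdims=True)
    return sd_s * (x_c - mu_c) / (sd_c + eps) + mu_s

# Toy usage with random stand-in features.
rng = np.random.default_rng(0)
q_c, q_cs = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
k_s, v_s = rng.normal(size=(32, 8)), rng.normal(size=(32, 8))
stylized = style_injected_attention(q_cs, q_c, k_s, v_s)

x_c = rng.normal(2.0, 3.0, size=(4, 8, 8))
x_s = rng.normal(-1.0, 0.5, size=(4, 8, 8))
latent0 = adain(x_c, x_s)
```

After `adain`, the content latent's per-channel mean and standard deviation match the style latent's, which is why the paper uses it to fix disharmonious colors before denoising begins.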
Pages: 8795 - 8805
Page count: 11
Related Papers
37 items in total
  • [1] A Training-Free Latent Diffusion Style Transfer Method
    Xiang, Zhengtao
    Wan, Xing
    Xu, Libo
    Yu, Xin
    Mao, Yuhan
    INFORMATION, 2024, 15 (10)
  • [2] Training-Free Diffusion Models for Content-Style Synthesis
    Xu, Ruipeng
    Shen, Fei
    Xie, Xu
    Li, Zongyi
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT X, ICIC 2024, 2024, 14871 : 308 - 319
  • [3] Multi-Source Training-Free Controllable Style Transfer via Diffusion Models
    Yu, Cuihong
    Han, Cheng
    Zhang, Chao
    SYMMETRY-BASEL, 2025, 17 (02)
  • [4] Inversion-based Style Transfer with Diffusion Models
    Zhang, Yuxin
    Huang, Nisha
    Tang, Fan
    Huang, Haibin
    Ma, Chongyang
    Dong, Weiming
    Xu, Changsheng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10146 - 10156
  • [5] Diffusion-Enhanced PatchMatch: A Framework for Arbitrary Style Transfer with Diffusion Models
    Hamazaspyan, Mark
    Navasardyan, Shant
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW, 2023, : 797 - 805
  • [6] CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models
    He, Feihong
    Li, Gang
    Si, Lingyu
    Yan, Leilei
    Hou, Shimeng
    Dong, Hongwei
    Li, Fanzhang
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 3825 - 3829
  • [7] Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models
    Wang, Hongjie
    Liu, Difan
    Kang, Yan
    Li, Yijun
    Lin, Zhe
    Jha, Niraj K.
    Liu, Yuchen
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 16080 - 16089
  • [8] FRDiff: Feature Reuse for Universal Training-Free Acceleration of Diffusion Models
    So, Junhyuk
    Lee, Jungwon
    Park, Eunhyeok
    COMPUTER VISION - ECCV 2024, PT LXXIII, 2025, 15131 : 328 - 344
  • [9] StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
    Wang, Zhizhong
    Zhao, Lei
    Xing, Wei
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 7643 - 7655
  • [10] Large-Scale Analysis of Style Injection by Relative Path Overwrite
    Arshad, Sajjad
    Mirheidari, Seyed Ali
    Lauinger, Tobias
    Crispo, Bruno
    Kirda, Engin
    Robertson, William
    WEB CONFERENCE 2018: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW2018), 2018, : 237 - 246