Query-Selected Global Attention for Text-Guided Image Style Transfer using Diffusion Model

Cited by: 0
Authors
Hwang, Jungmin [1 ]
Lee, Won-Sook [1 ]
Affiliations
[1] Univ Ottawa, Fac Engn, Sch EECS, Ottawa, ON, Canada
Keywords
Diffusion; Style Transfer; Query Selection; Global Attention;
DOI
10.1109/CAI59869.2024.00207
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Diffusion models have gained tremendous interest in image generation, and text-guided methods for manipulating source images have shown successful progress. However, research on style transfer using diffusion models is still ongoing to address the trade-off between style transfer and content preservation. One representative solution is self-supervised contrastive learning, which extracts features at the same pixel locations in the source and generated images. However, certain areas of the source image carry more information than others and should be preserved. We therefore propose anchoring the areas to be preserved and intentionally selecting features at the anchor points through a query-selected global attention method. This enables our method to generate an image that preserves the content of the source while transferring the style, without additional fine-tuning or an auxiliary network. Our diffusion model follows a simple architecture that enhances image quality and speeds up inference compared with other diffusion methods. Our experimental results also demonstrate superior performance.
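The core idea described in the abstract, taking attention queries only at anchored positions while keys and values span the whole feature map, can be illustrated with a minimal sketch. This is an assumption-laden toy implementation, not the authors' code: the function name, shapes, and single-head formulation are all illustrative.

```python
# Illustrative sketch of query-selected global attention (not the paper's
# implementation): queries come only from chosen anchor positions, while
# keys/values cover every position in the feature map.
import numpy as np

def query_selected_global_attention(features, anchor_idx):
    """features: (N, C) flattened feature map; anchor_idx: anchored positions."""
    q = features[anchor_idx]          # (M, C) queries at anchor points only
    k = v = features                  # (N, C) global keys and values
    scores = q @ k.T / np.sqrt(features.shape[1])      # (M, N) similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)      # softmax over all positions
    return weights @ v                # (M, C) attended features for the anchors

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 8))                   # 16 positions, 8 channels
out = query_selected_global_attention(feats, np.array([2, 5]))
print(out.shape)  # (2, 8)
```

Restricting the query set to anchors keeps attention focused on the regions to be preserved while still aggregating context from the entire image, which matches the abstract's stated goal of content preservation without auxiliary networks.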
Pages: 1162-1166 (5 pages)
Related Papers (50 total)
  • [1] Object-stable unsupervised dual contrastive learning image-to-image translation with query-selected attention and convolutional block attention module
    Oh, Yunseok
    Oh, Seonhye
    Noh, Sangwoo
    Kim, Hangyu
    Seo, Hyeon
    PLOS ONE, 2023, 18 (11):
  • [2] SGDM: An Adaptive Style-Guided Diffusion Model for Personalized Text to Image Generation
    Xu, Yifei
    Xu, Xiaolong
    Gao, Honghao
    Xiao, Fu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 9804 - 9813
  • [3] Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
    Yang, Serin
    Hwang, Hyunmin
    Ye, Jong Chul
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22816 - 22825
  • [4] Text-Guided Attention Model for Image Captioning
    Mun, Jonghwan
    Cho, Minsu
    Han, Bohyung
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4233 - 4239
  • [5] Enhanced Text-Guided Attention Model for Image Captioning
    Zhou, Yuanen
    Hu, Zhenzhen
    Zhao, Ye
    Liu, Xueliang
    Hong, Richang
    2018 IEEE FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2018,
  • [6] Text-Guided Style Transfer-Based Image Manipulation Using Multimodal Generative Models
    Togo, Ren
    Kotera, Megumi
    Ogawa, Takahiro
    Haseyama, Miki
    IEEE ACCESS, 2021, 9 : 64860 - 64870
  • [7] Diff-TST: Diffusion model for one-shot text-image style transfer
    Pang, Sizhe
    Chen, Xinyuan
    Xie, Yangchen
    Zhan, Hongjian
    Yin, Bing
    Lu, Yue
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 263
  • [8] Global-Guided Asymmetric Attention Network for Image-Text Matching
    Wu, Dongqing
    Li, Huihui
    Tang, Yinge
    Guo, Lei
    Liu, Hang
    NEUROCOMPUTING, 2022, 481 : 77 - 90
  • [9] Text Image Inpainting via Global Structure-Guided Diffusion Models
    Zhu, Shipeng
    Fang, Pengfei
    Zhu, Chenjie
    Zhao, Zuoyan
    Xu, Qiang
    Xue, Hui
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024, : 7775 - 7783