S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention

Citations: 0
Authors
Zhang, Chiyu [1, 2]
Xu, Xiaogang [3, 4]
Wang, Lei [1]
Dai, Zaiyan [1]
Yang, Jun [1, 5]
Affiliations
[1] Sichuan Normal Univ, Chengdu, Sichuan, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Nanjing, Jiangsu, Peoples R China
[3] Zhejiang Lab, Hangzhou, Zhejiang, Peoples R China
[4] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
[5] Visual Comp & Virtual Real Key Lab Sichuan Prov, Chengdu, Sichuan, Peoples R China
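Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7 | 2024
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract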
Transformer's recent integration into style transfer leverages its proficiency in establishing long-range dependencies, albeit at the expense of attenuated local modeling. This paper introduces Strips Window Attention Transformer (S2WAT), a novel hierarchical vision transformer designed for style transfer. S2WAT employs attention computation in diverse window shapes to capture both short- and long-range dependencies. The merged dependencies utilize the "Attn Merge" strategy, which adaptively determines spatial weights based on their relevance to the target. Extensive experiments on representative datasets show the proposed method's effectiveness compared to state-of-the-art (SOTA) transformer-based and other approaches. The code and pre-trained models are available at https://github.com/AlienZhang1996/S2WAT.
Pages: 7024-7032
Page count: 9
Related papers
35 in total
  • [1] ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows
    An, Jie
    Huang, Siyu
    Song, Yibing
    Dou, Dejing
    Liu, Wei
    Luo, Jiebo
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 862-871
  • [2] Bai Yushi, 2023, arXiv
  • [3] StyleBank: An Explicit Representation for Neural Image Style Transfer
    Chen, Dongdong
    Yuan, Lu
    Liao, Jing
    Yu, Nenghai
    Hua, Gang
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 2770-2779
  • [4] Chen Haibo, 2021, NeurIPS
  • [5] Cheng J., 2016, INEMNLP 2016 C EMPIR, P551, DOI 10.18653/v1/d16-1053
  • [6] StyTr2: Image Style Transfer with Transformers
    Deng, Yingying
    Tang, Fan
    Dong, Weiming
    Ma, Chongyang
    Pan, Xingjia
    Wang, Lei
    Xu, Changsheng
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 11316-11326
  • [7] Deng YY, 2021, AAAI CONF ARTIF INTE, V35, P1210
  • [8] Arbitrary Style Transfer via Multi-Adaptation Network
    Deng, Yingying
    Tang, Fan
    Dong, Weiming
    Sun, Wen
    Huang, Feiyue
    Xu, Changsheng
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 2719-2727
  • [9] Efros AA, 2001, COMP GRAPH, P341, DOI 10.1145/383259.383296
  • [10] Multiscale Vision Transformers
    Fan, Haoqi
    Xiong, Bo
    Mangalam, Karttikeya
    Li, Yanghao
    Yan, Zhicheng
    Malik, Jitendra
    Feichtenhofer, Christoph
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6804-6815