GazeFusion: Saliency-Guided Image Generation

Cited by: 0
Authors
Zhang, Yunxiang [1]
Wu, Nan [2]
Lin, Connor Z. [2]
Wetzstein, Gordon [2]
Sun, Qi [1]
Affiliations
[1] NYU, Brooklyn, NY 11201 USA
[2] Stanford Univ, Stanford, CA USA
Funding
National Science Foundation (USA)
Keywords
Human Visual Attention; Perceptual Computer Graphics; Controllable Image Generation; VISUAL-ATTENTION; ALLOCATION; MODEL;
DOI
10.1145/3694969
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Diffusion models offer unprecedented image generation power given just a text prompt. While emerging approaches for controlling diffusion models have enabled users to specify the desired spatial layouts of the generated content, they cannot predict or control where viewers will pay more attention due to the complexity of human vision. Recognizing the significance of attention-controllable image generation in practical applications, we present a saliency-guided framework to incorporate the data priors of human visual attention mechanisms into the generation process. Given a user-specified viewer attention distribution, our control module conditions a diffusion model to generate images that attract viewers' attention toward the desired regions. To assess the efficacy of our approach, we performed an eye-tracked user study and a large-scale model-based saliency analysis. The results evidence that both the cross-user eye gaze distributions and the saliency models' predictions align with the desired attention distributions. Lastly, we outline several applications, including interactive design of saliency guidance, attention suppression in unwanted regions, and adaptive generation for varied display/viewing conditions.
Pages: 19
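The abstract reports that saliency-model predictions align with the desired attention distributions, but does not say how that alignment is scored. As a minimal sketch only, the snippet below compares a target attention map with a predicted saliency map using two measures common in saliency benchmarking: Pearson's correlation coefficient and KL divergence. The choice of metrics and the function names `normalize`, `pearson_cc`, and `kl_divergence` are assumptions made for illustration, not the authors' evaluation code.

```python
import math


def normalize(saliency_map):
    """Flatten a 2D map (list of rows) and scale it so the values sum to 1."""
    flat = [float(v) for row in saliency_map for v in row]
    total = sum(flat)
    if total <= 0:
        raise ValueError("map must have positive total mass")
    return [v / total for v in flat]


def pearson_cc(target, predicted):
    """Pearson correlation between two maps; 1.0 means perfectly aligned."""
    p, q = normalize(target), normalize(predicted)
    mean_p, mean_q = sum(p) / len(p), sum(q) / len(q)
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    std_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    std_q = math.sqrt(sum((b - mean_q) ** 2 for b in q))
    if std_p == 0 or std_q == 0:
        return 0.0  # a constant map carries no spatial preference
    return cov / (std_p * std_q)


def kl_divergence(target, predicted, eps=1e-12):
    """KL divergence from predicted to target; 0 means identical distributions."""
    p, q = normalize(target), normalize(predicted)
    return sum(a * math.log(a / (b + eps) + eps) for a, b in zip(p, q))


# Hypothetical 2x2 maps: attention requested toward the bottom-right corner,
# and a prediction that places it in the opposite corner.
target = [[0, 1], [2, 3]]
opposite = [[3, 2], [1, 0]]

print(pearson_cc(target, target), kl_divergence(target, target))
print(pearson_cc(target, opposite), kl_divergence(target, opposite))
```

A higher correlation and a lower divergence both indicate that the predicted saliency sits closer to the requested attention distribution. In practice the maps would be full-resolution arrays rather than nested lists.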