GazeFusion: Saliency-Guided Image Generation

Cited by: 0

Authors:

Zhang, Yunxiang [1]

Wu, Nan [2]

Lin, Connor Z. [2]

Wetzstein, Gordon [2]

Sun, Qi [1]

Affiliations:

[1] NYU, Brooklyn, NY 11201 USA

[2] Stanford Univ, Stanford, CA USA

Source:

ACM Transactions on Applied Perception | 2024, Vol. 21, No. 4

Funding:

U.S. National Science Foundation

Keywords:

Human Visual Attention; Perceptual Computer Graphics; Controllable Image Generation; VISUAL-ATTENTION; ALLOCATION; MODEL;

DOI:

10.1145/3694969

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

Diffusion models offer unprecedented image generation power given just a text prompt. While emerging approaches for controlling diffusion models have enabled users to specify the desired spatial layouts of the generated content, they cannot predict or control where viewers will pay more attention due to the complexity of human vision. Recognizing the significance of attention-controllable image generation in practical applications, we present a saliency-guided framework to incorporate the data priors of human visual attention mechanisms into the generation process. Given a user-specified viewer attention distribution, our control module conditions a diffusion model to generate images that attract viewers' attention toward the desired regions. To assess the efficacy of our approach, we performed an eye-tracked user study and a large-scale model-based saliency analysis. The results evidence that both the cross-user eye gaze distributions and the saliency models' predictions align with the desired attention distributions. Lastly, we outline several applications, including interactive design of saliency guidance, attention suppression in unwanted regions, and adaptive generation for varied display/viewing conditions.

Pages: 19

