Unifying Layout Generation with a Decoupled Diffusion Model

Citations: 17
Authors
Hui, Mude [1]
Zhang, Zhizheng [2]
Zhang, Xiaoyi [2]
Xie, Wenxuan [2]
Wang, Yuwang [3]
Lu, Yan [2]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] Tsinghua Univ, Beijing, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
DOI
10.1109/CVPR52729.2023.00193
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Layout generation aims to synthesize realistic graphic scenes consisting of elements with different attributes including category, size, position, and between-element relation. It is a crucial task for reducing the burden on heavy-duty graphic design works for formatted scenes, e.g., publications, documents, and user interfaces (UIs). Diverse application scenarios impose a big challenge in unifying various layout generation subtasks, including conditional and unconditional generation. In this paper, we propose a Layout Diffusion Generative Model (LDGM) to achieve such unification with a single decoupled diffusion model. LDGM views a layout of arbitrary missing or coarse element attributes as an intermediate diffusion status from a completed layout. Since different attributes have their individual semantics and characteristics, we propose to decouple the diffusion processes for them to improve the diversity of training samples and learn the reverse process jointly to exploit global-scope contexts for facilitating generation. As a result, our LDGM can generate layouts either from scratch or conditional on arbitrary available attributes. Extensive qualitative and quantitative experiments demonstrate our proposed LDGM outperforms existing layout generation models in both functionality and performance.
Pages: 1942-1951
Page count: 10
Related papers
31 in total
[1]   Variational Transformer Networks for Layout Generation [J].
Arroyo, Diego Martin ;
Postels, Janis ;
Tombari, Federico .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :13637-13647
[2]  
Austin J, 2021, ADV NEUR IN
[3]  
Chen Nanxin, 2020, INT C LEARN REPR
[4]   Rico: A Mobile App Dataset for Building Data-Driven Design Applications [J].
Deka, Biplab ;
Huang, Zifeng ;
Franzen, Chad ;
Hibschman, Joshua ;
Afergan, Daniel ;
Li, Yang ;
Nichols, Jeffrey ;
Kumar, Ranjitha .
UIST'17: PROCEEDINGS OF THE 30TH ANNUAL ACM SYMPOSIUM ON USER INTERFACE SOFTWARE AND TECHNOLOGY, 2017, :845-854
[5]  
Gong Shansan, 2022, arXiv preprint arXiv:2210.08933
[6]   Vector Quantized Diffusion Model for Text-to-Image Synthesis [J].
Gu, Shuyang ;
Chen, Dong ;
Bao, Jianmin ;
Wen, Fang ;
Zhang, Bo ;
Chen, Dongdong ;
Yuan, Lu ;
Guo, Baining .
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, :10686-10696
[7]  
Ho Jonathan., 2020, P 34 INT C NEURAL IN, P6840
[8]   Message from the Guest Editors of the Special Issue on Entry, Descent, and Landing of Tianwen-1-China's First Mission to Mars [J].
Huang, Xiangyu ;
Jiang, Yu ;
Zeng, Xiangyuan .
ASTRODYNAMICS, 2022, 6 (01) :1-1
[9]  
Jiang ZY, 2022, AAAI CONF ARTIF INTE, P1096
[10]   LayoutVAE: Stochastic Scene Layout Generation From a Label Set [J].
Jyothi, Akash Abdu ;
Durand, Thibaut ;
He, Jiawei ;
Sigal, Leonid ;
Mori, Greg .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :9894-9903