Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

Cited by: 0
Authors
Wang, Zhendong [1 ,2 ]
Jiang, Yifan [1 ]
Zheng, Huangjie [1 ,2 ]
Wang, Peihao [1 ]
He, Pengcheng [2 ]
Wang, Zhangyang [1 ]
Chen, Weizhu [2 ]
Zhou, Mingyuan [1 ]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] Microsoft Azure AI, Austin, TX 78759 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Diffusion models are powerful, but they require a lot of time and data to train. We propose Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training time costs while improving data efficiency, which thus helps democratize diffusion model training to broader users. At the core of our innovations is a new conditional score function at the patch level, where the patch location in the original image is included as additional coordinate channels, while the patch size is randomized and diversified throughout training to encode the cross-region dependency at multiple scales. Sampling with our method is as easy as in the original diffusion model. Through Patch Diffusion, we could achieve ≥ 2× faster training, while maintaining comparable or better generation quality. Patch Diffusion meanwhile improves the performance of diffusion models trained on relatively small datasets, e.g., as few as 5,000 images to train from scratch. We achieve outstanding FID scores in line with state-of-the-art benchmarks: 1.77 on CelebA-64x64, 1.93 on AFHQv2-Wild-64x64, and 2.72 on ImageNet-256x256. We share our code and pre-trained models at https://github.com/Zhendong-Wang/Patch-Diffusion.
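The abstract's core mechanism, cropping a random patch at a randomized size and conditioning the score network on the patch's location via extra coordinate channels, can be illustrated with a short sketch. The snippet below is a hedged illustration, not the authors' released code: the function name random_patch_with_coords, the candidate patch sizes, and the [-1, 1] coordinate normalization are all assumptions; see the linked repository for the actual implementation.

import torch

def random_patch_with_coords(x, patch_sizes=(16, 32, 64)):
    # x: batch of full images, shape (B, C, H, W); returns (B, C + 2, p, p).
    B, C, H, W = x.shape
    # Randomize the patch size so training sees cross-region structure
    # at multiple scales (the candidate sizes here are illustrative).
    p = patch_sizes[torch.randint(len(patch_sizes), (1,)).item()]
    top = torch.randint(0, H - p + 1, (1,)).item()
    left = torch.randint(0, W - p + 1, (1,)).item()
    patch = x[:, :, top:top + p, left:left + p]
    # Two extra channels encode the pixel coordinates of the patch inside
    # the original image, normalized to [-1, 1].
    ys = torch.linspace(-1.0, 1.0, H, device=x.device)[top:top + p]
    xs = torch.linspace(-1.0, 1.0, W, device=x.device)[left:left + p]
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    coords = torch.stack([yy, xx]).unsqueeze(0).expand(B, -1, -1, -1)
    return torch.cat([patch, coords.to(x.dtype)], dim=1)

# Usage: the score network then takes C + 2 input channels; at sampling
# time the model can be conditioned on a full-image coordinate grid.
imgs = torch.randn(8, 3, 64, 64)
patched = random_patch_with_coords(imgs)  # e.g. shape (8, 5, 32, 32)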
Pages: 18
Related Papers (50 records)
[21]   Data-Efficient Performance Modeling via Pre-training [J].
Liu, Chunting ;
Baghdadi, Riyadh .
PROCEEDINGS OF THE 34TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION, CC 2025, 2025, :48-59
[22]   Perception Prioritized Training of Diffusion Models [J].
Choi, Jooyoung ;
Lee, Jungbeom ;
Shin, Chaehun ;
Kim, Sungwon ;
Kim, Hyunwoo ;
Yoon, Sungroh .
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, :11462-11471
[23]   Data-efficient Neuroevolution with Kernel-Based Surrogate Models [J].
Gaier, Adam ;
Asteroth, Alexander ;
Mouret, Jean-Baptiste .
GECCO'18: PROCEEDINGS OF THE 2018 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2018, :85-92
[24]   Experimental data-efficient reinforcement learning with an ensemble of surrogate models [J].
Jiang, Jiazhou ;
Chen, Zhiyong .
NEURAL NETWORKS, 2025, 192
[25]   Towards Trustworthy and Efficient Diffusion Models [J].
Vora, Jayneel .
PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2024, 2024, :647-651
[26]   DICE: Data-Efficient Clinical Event Extraction with Generative Models [J].
Ma, Mingyu Derek ;
Taylor, Alexander K. ;
Wang, Wei ;
Peng, Nanyun .
PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, :15898-15917
[27]   Faster Is More Different: Mean-Field Dynamics of Innovation Diffusion [J].
Baek, Seung Ki ;
Durang, Xavier ;
Kim, Mina .
PLOS ONE, 2013, 8 (07)
[28]   Faster, More Accurate Quantification of Diffusion and Correlated Motions in Lipid Bilayers [J].
Urner, Tara M. ;
Claflin, Gwendolyn A. ;
Lerner, Michael G. ;
Kyvelou-Kokkaliaris, Rodoula .
BIOPHYSICAL JOURNAL, 2016, 110 (03) :568A-568A
[29]   TwinLab: a framework for data-efficient training of non-intrusive reduced-order models for digital twins [J].
Kannapinn, Maximilian ;
Schaefer, Michael ;
Weeger, Oliver .
ENGINEERING COMPUTATIONS, 2024,
[30]   Post Training Quantization Strategies for Diffusion Models [J].
Vora, Jayneel .
PUBLICATION OF THE 26TH ACM INTERNATIONAL CONFERENCE ON MOBILE HUMAN-COMPUTER INTERACTION, MOBILEHCI 2024 ADJUNCT PROCEEDINGS, 2024,