Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

Cited by: 68
Authors
Somepalli, Gowthami [1]
Singla, Vasu [1]
Goldblum, Micah [2]
Geiping, Jonas [1]
Goldstein, Tom [1]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
[2] NYU, New York, NY USA
Source
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023
Funding
National Science Foundation (USA)
DOI
10.1109/CVPR52729.2023.00586
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes. But do diffusion models create unique works of art, or are they replicating content directly from their training sets? In this work, we study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated. Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication. We also identify cases where diffusion models, including the popular Stable Diffusion model, blatantly copy from their training data. Project page: https://somepago.github.io/diffrep.html
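As a rough illustration of the retrieval idea described in the abstract, the sketch below compares one generated image against a set of training images by cosine similarity of feature vectors and flags a likely replication when the best match exceeds a threshold. It is a minimal sketch under stated assumptions, not the authors' framework: the embeddings are made-up toy vectors, the names `cosine_similarity` and `find_replication` are hypothetical, and the threshold of 0.9 is an arbitrary illustrative value. The abstract does not say which feature extractors or similarity measures the paper uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def find_replication(generated, training, threshold=0.9):
    """Return (index, score, flagged) for the closest training embedding.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    best_index, best_score = -1, -1.0
    for i, candidate in enumerate(training):
        score = cosine_similarity(generated, candidate)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score, best_score >= threshold


# Toy embeddings standing in for features of real images (assumption).
training_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
]
generated_embedding = [0.59, 0.81, 0.01]

index, score, flagged = find_replication(generated_embedding, training_embeddings)
print(index, round(score, 3), flagged)
```

In practice the linear scan would be replaced by an approximate nearest-neighbor index for large training sets, and the choice of feature extractor largely determines what counts as "replicated" content.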
Pages: 6048-6058
Page count: 11