Advances in diffusion models for image data augmentation: a review of methods, models, evaluation metrics and future research directions

Cited by: 4

Authors:

Alimisis, Panagiotis [1]

Mademlis, Ioannis [1]

Radoglou-Grammatikis, Panagiotis [2,3]

Sarigiannidis, Panagiotis [2]

Papadopoulos, Georgios Th. [1]

Affiliations:

[1] Harokopio Univ Athens, Dept Informat & Telemat, Thiseos 70,Attiki, Athens 17676, Greece

[2] Univ Western Macedonia, Dept Elect & Comp Engn, Act Urban Planning Zone, Kozani 50150, Kozani, Greece

[3] K3Y, Vitosha Quarter,Bl 9, BG-1700 Sofia, Bulgaria

Source:

ARTIFICIAL INTELLIGENCE REVIEW | 2025 / Volume 58 / Issue 04

Keywords:

Image data augmentation; Diffusion models; Generative artificial intelligence; Evaluation metrics;

DOI:

10.1007/s10462-025-11116-x

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image data augmentation is a critical methodology in modern computer vision, since it can enhance the diversity and quality of training datasets, thereby improving the performance and robustness of machine learning models in downstream tasks. In parallel, augmentation approaches can also be used to edit or modify a given image in a context- and semantics-aware way. Diffusion Models (DMs), one of the most recent and promising classes of methods in generative Artificial Intelligence (AI), have emerged as a powerful tool for image data augmentation, capable of generating realistic and diverse images by learning the underlying data distribution. This study presents a systematic, comprehensive and in-depth review of DM-based approaches for image augmentation, covering a wide range of strategies, tasks and applications. First, the fundamental principles, model architectures and training strategies of DMs are analyzed. Subsequently, a taxonomy of the relevant image augmentation methods is introduced, focusing on techniques for semantic manipulation, personalization and adaptation, and application-specific augmentation tasks. Then, performance assessment methodologies and the corresponding evaluation metrics are analyzed. Finally, current challenges and future research directions in the field are discussed.

References

Pages: 55

272 entries in total
