A Systematic Review of Synthetic Data Generation Techniques Using Generative AI

Cited by: 13
Authors
Goyal, Mandeep [1]
Mahmoud, Qusay H. [1]
Affiliations
[1] Ontario Tech Univ, Dept Elect Comp & Software Engn, Oshawa, ON L1G 0C5, Canada
Keywords
synthetic data; LLMs; GANs; VAEs; generative AI; neural networks; machine learning
DOI
10.3390/electronics13173509
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Synthetic data are increasingly recognized for their potential to address serious real-world challenges across domains. They offer a way to mitigate the data scarcity, privacy concerns, and algorithmic biases commonly encountered in machine learning applications. Synthetic data are intended to preserve the underlying patterns and behaviors of the original dataset while altering the actual content. The methods proposed in the literature to generate synthetic data range from large language models (LLMs), which are pre-trained on very large datasets, to generative adversarial networks (GANs) and variational autoencoders (VAEs). This study systematically reviews the techniques proposed in the literature for generating synthetic data, identifies their limitations, and suggests potential future research areas. The findings indicate that while these technologies can generate synthetic data of specific data types, they still have drawbacks, such as high computational requirements, unstable training, and limited privacy-preserving measures, which restrict their real-world usability. Addressing these issues will facilitate the broader adoption of synthetic data generation techniques across various disciplines, thereby advancing machine learning and data-driven solutions.
Pages: 38
Related papers
50 in total
  • [1] Music Generation Using Deep Learning and Generative AI: A Systematic Review
    Mitra, Rohan
    Zualkernan, Imran
    IEEE ACCESS, 2025, 13 : 18079 - 18106
  • [2] Revolutionizing personalized medicine with generative AI: a systematic review
    Ghebrehiwet, Isaias
    Zaki, Nazar
    Damseh, Rafat
    Mohamad, Mohd Saberi
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (05)
  • [3] A Systematic Review for the Implication of Generative AI in Higher Education
    Al-Shabandar, Raghad
    Jaddoad, Ail
    Elwi, Taha A.
    Mohammed, A. H.
    Hussain, Abir Jaafar
    INFOCOMMUNICATIONS JOURNAL, 2024, 16 (03) : 31 - 42
  • [4] A Systematic Review of Generative AI for Teaching and Learning Practice
    Ogunleye, Bayode
    Zakariyyah, Kudirat Ibilola
    Ajao, Oluwaseun
    Olayinka, Olakunle
    Sharma, Hemlata
    EDUCATION SCIENCES, 2024, 14 (06)
  • [5] Systematic Review of Generative Modelling Tools and Utility Metrics for Fully Synthetic Tabular Data
    Lautrup, Anton Danholt
    Hyrup, Tobias
    Zimek, Arthur
    Schneider-Kamp, Peter
    ACM COMPUTING SURVEYS, 2025, 57 (04)
  • [6] Diabetic retinopathy detection through generative AI techniques: A review
    Bansal, Vipin
    Jain, Amit
    Walia, Navpreet Kaur
    RESULTS IN OPTICS, 2024, 16
  • [7] Generation of Synthetic Data with Conditional Generative Adversarial Networks
    Vega-Marquez, Belen
    Rubio-Escudero, Cristina
    Nepomuceno-Chamorro, Isabel
    LOGIC JOURNAL OF THE IGPL, 2022, 30 (02) : 252 - 262
  • [8] A systematic review of current AI techniques used in the context of the SDGs
    Greif, Lucas
    Roeckel, Fabian
    Kimmig, Andreas
    Ovtcharova, Jivka
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH, 2025, 19 (01)
  • [9] Privacy Mechanisms and Evaluation Metrics for Synthetic Data Generation: A Systematic Review
    Osorio-Marulanda, Pablo A.
    Epelde, Gorka
    Hernandez, Mikel
    Isasa, Imanol
    Reyes, Nicolas Moreno
    Iraola, Andoni Beristain
    IEEE ACCESS, 2024, 12 : 88048 - 88074
  • [10] Generative models for synthetic data generation: application to pharmacokinetic/pharmacodynamic data
    Jiang, Yulun
    Garcia-Duran, Alberto
    Losada, Idris Bachali
    Girard, Pascal
    Terranova, Nadia
    JOURNAL OF PHARMACOKINETICS AND PHARMACODYNAMICS, 2024, 51 (06) : 877 - 885