A Systematic Review of Synthetic Data Generation Techniques Using Generative AI

Cited by: 43
Authors
Goyal, Mandeep [1]
Mahmoud, Qusay H. [1]
Affiliation
[1] Ontario Tech Univ, Dept Elect Comp & Software Engn, Oshawa, ON L1G 0C5, Canada
Keywords
synthetic data; LLMs; GANs; VAEs; generative AI; neural networks; machine learning
DOI
10.3390/electronics13173509
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Synthetic data are increasingly recognized for their potential to address serious real-world challenges in various domains. They offer a way to mitigate the data scarcity, privacy concerns, and algorithmic biases commonly encountered in machine learning applications. Synthetic data preserve the underlying patterns and behaviors of the original dataset while altering the actual content. The methods proposed in the literature to generate synthetic data range from large language models (LLMs), which are pre-trained on very large datasets, to generative adversarial networks (GANs) and variational autoencoders (VAEs). This study provides a systematic review of the techniques proposed in the literature for generating synthetic data, with the aim of identifying their limitations and suggesting potential future research areas. The findings indicate that while these technologies can generate synthetic data of specific data types, they still have drawbacks, such as computational requirements, training instability, and insufficient privacy-preserving measures, which limit their real-world usability. Addressing these issues will facilitate the broader adoption of synthetic data generation techniques across various disciplines, thereby advancing machine learning and data-driven solutions.
Pages: 38