Is synthetic data generation effective in maintaining clinical biomarkers? Investigating diffusion models across diverse imaging modalities

Authors
Hosseini, Abdullah [1]
Serag, Ahmed [1]
Affiliations
[1] Weill Cornell Med Qatar, AI Innovat Lab, Doha, Qatar
Source
Frontiers in Artificial Intelligence | 2025 / Volume 7
Keywords
synthetic data generation; clinical biomarkers; denoising diffusion models; medical imaging; Swin-transformer network
DOI
10.3389/frai.2024.1454441
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Introduction: The integration of recent technologies in medical imaging has become a cornerstone of modern healthcare, facilitating detailed analysis of internal anatomy and pathology. Traditional methods, however, often grapple with data-sharing restrictions due to privacy concerns. Emerging techniques in artificial intelligence offer ways around these constraints: synthetic data generation can create realistic medical imaging datasets, but whether it preserves critical hidden medical biomarkers remains an open question.
Methods: This study employs state-of-the-art Denoising Diffusion Probabilistic Models integrated with a Swin-transformer-based network to generate synthetic medical data. Three distinct areas of medical imaging (radiology, ophthalmology, and histopathology) are explored. The quality of the synthetic images is evaluated with a classifier trained to test whether medical biomarkers are preserved.
Results: The diffusion model effectively preserves key medical features, such as lung markings and retinal abnormalities, producing synthetic images that closely resemble real data. Classifier performance demonstrates the reliability of synthetic data for downstream tasks, with F1 and AUC scores in the range 0.8-0.99.
Discussion: This work provides insight into the potential of diffusion-based models for generating realistic, biomarker-preserving synthetic images across various medical imaging modalities. The findings highlight the potential of synthetic data to address challenges such as data scarcity and privacy concerns in clinical practice, research, and education.
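The abstract reports classifier performance as F1 and AUC but does not spell out the evaluation protocol. As a rough illustration of how these two metrics are computed from a binary classifier's outputs, the following is a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold are hypothetical toy values chosen for illustration; they are not taken from the study, and the function names are not the authors' code.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = biomarker present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]       # assumed classifier probabilities
y_pred = [int(s >= 0.5) for s in scores]       # assumed 0.5 decision threshold

print(f"F1  = {f1_score(y_true, y_pred):.3f}")
print(f"AUC = {roc_auc(y_true, scores):.3f}")
```

F1 depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are commonly reported together.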
Pages: 9