GAN-based data augmentation to improve breast ultrasound and mammography mass classification

Cited by: 15
Authors
Jimenez-Gaona, Yuliana [1, 3, 5]
Carrion-Figueroa, Diana [2]
Lakshminarayanan, Vasudevan [3, 4]
Rodriguez-Alvarez, Maria Jose [5]
Affiliations
[1] Univ Tecn Particular Loja, Dept Quim & Ciencias Exactas, San Cayetano Alto S-N CP1101608, Loja, Ecuador
[2] Hosp Gen Sur Quito IESS, Calle Moraspungo & Pinllopata, Quito 170111, Ecuador
[3] Univ Waterloo, Sch Optometry & Vis Sci, Theoret & Expt Epistemol Lab, Waterloo, ON N2L 3G1, Canada
[4] Univ Waterloo, Dept Syst Design Engn Phys & Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
[5] Univ Politecn Valencia, Inst Instrumentac Imagen Mol I3M, E-46022 Valencia, Spain
Keywords
Breast cancer; Data augmentation; Deep learning algorithms; Generative Adversarial Networks (GAN); Mammography; Ultrasound
DOI
10.1016/j.bspc.2024.106255
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Data imbalance is a common problem in breast cancer diagnosis. To address this challenge, the research explores the use of Generative Adversarial Networks (GANs) to generate synthetic medical data. Various GAN methods, including Wasserstein GAN with Gradient Penalty (WGAN-GP), Cycle GAN, Conditional GAN (CGAN), and Spectral Normalization GAN (SNGAN), were tested for data augmentation in breast regions of interest (ROIs) using mammography and ultrasound databases. The study employed real, synthetic, and hybrid ROIs (128 × 128 pixels) to train a ResNet network to classify them as benign (B) or malignant (M). The quality and diversity of the synthetic data were assessed using several metrics: Fréchet Inception Distance (FID), Kernel Inception Distance (KID), Structural Similarity Index (SSIM), Multi-Scale SSIM (MS-SSIM), Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE), Naturalness Image Quality Evaluator (NIQE), and Perception-based Image Quality Evaluator (PIQE). Results revealed that the SNGAN model (FID = 52.89) was most effective for augmenting mammography data, while CGAN (FID = 116.03) excelled with ultrasound data. Cycle GAN and WGAN-GP, though demonstrating lower KID values, did not perform better than SNGAN and CGAN. The lower average MS-SSIM values suggested that SNGAN and CGAN produced a high diversity of synthetic images. However, lower SSIM, BRISQUE, NIQE, and PIQE values indicated poor quality in both real and synthetic images. Classification results showed high accuracy without data augmentation in both ultrasound (US; 93.1% B / 94.9% M) and mammography (80.9% B / 76.9% M). The research concludes that preprocessing and characterizing ROIs by abnormality type is crucial to generate diverse synthetic data and improve accuracy in the classification process using combined GAN and CNN models.
Pages: 18