Improving classification accuracy using data augmentation on small data sets

Cited: 129
Authors
Moreno-Barea, Francisco J. [1 ]
Jerez, Jose M. [1 ]
Franco, Leonardo [1 ]
Affiliations
[1] Univ Malaga, Escuela Tecn Super Ingn Informat, Dept Lenguajes & Ciencias Comp, 35 Bulevar Louis Pasteur, Malaga, Spain
Keywords
Deep Learning; Data augmentation; GAN; VAE; Unbalanced sets; Neural networks
DOI
10.1016/j.eswa.2020.113696
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Data augmentation (DA) is a key element in the success of Deep Learning (DL) models, as it can improve prediction accuracy when large data sets are used. DA was not widely used with earlier neural network models (before 2012), likely because of the type of models and the size of the data sets involved. In this work we apply several state-of-the-art models based on Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to investigate the effect of DA on small data sets, analyzing the prediction accuracy obtained according to the characteristics of the training samples (number of instances and features, and degree of class unbalance). We further modify the standard methods for generating synthetic samples so as to alter the class balance representation, and the overall results indicate that, with some computational effort, a significant increase in prediction accuracy can be obtained on small data sets. (C) 2020 Elsevier Ltd. All rights reserved.
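The core idea described in the abstract — generating synthetic samples for the under-represented class until the class balance is altered — can be sketched with a much simpler generator than the paper's VAEs and GANs. The sketch below uses Gaussian-noise jitter around existing minority samples as a stand-in generator; the function name, `noise_scale` parameter, and jitter approach are illustrative assumptions, not the authors' method.

```python
import numpy as np

def augment_minority(X, y, minority_label, target_count,
                     noise_scale=0.05, seed=0):
    """Oversample one class by drawing existing minority samples and
    adding small Gaussian noise, until that class has `target_count`
    instances.  A simple DA baseline; the paper trains VAE/GAN
    generators to produce the synthetic samples instead."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = target_count - len(X_min)
    if n_new <= 0:                      # class already large enough
        return X, y
    idx = rng.integers(0, len(X_min), size=n_new)
    X_new = X_min[idx] + rng.normal(0.0, noise_scale,
                                    size=(n_new, X.shape[1]))
    X_aug = np.vstack([X, X_new])
    y_aug = np.concatenate([y, np.full(n_new, minority_label)])
    return X_aug, y_aug
```

For example, an 8-vs-2 binary set rebalanced to 8-vs-8 yields 16 training samples, with the 6 new minority instances clustered near the original two. The classifier is then trained on the augmented set exactly as on real data.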
Pages: 14