Do We Train on Test Data? Purging CIFAR of Near-Duplicates

Citations: 33
Authors
Barz, Bjoern [1]
Denzler, Joachim [1]
Affiliations
[1] Friedrich Schiller Univ Jena, Comp Vis Grp, Ernst Abbe Pl 2, D-07743 Jena, Germany
Keywords
image classification; deep learning; reproducibility; duplicates; image retrieval
DOI
10.3390/jimaging6060041
Chinese Library Classification
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
The CIFAR-10 and CIFAR-100 datasets are two of the most heavily benchmarked datasets in computer vision and are often used to evaluate novel methods and model architectures in the field of deep learning. However, we find that 3.3% and 10% of the test images of CIFAR-10 and CIFAR-100, respectively, have duplicates in the training set. These duplicates are easily recognizable by memorization and may, hence, bias the comparison of image recognition techniques regarding their generalization capability. To eliminate this bias, we provide the "fair CIFAR" (ciFAIR) dataset, where we replaced all duplicates in the test sets with new images sampled from the same domain. The training set remains unchanged, in order not to invalidate pre-trained models. We then re-evaluate the classification performance of various popular state-of-the-art CNN architectures on these new test sets to investigate whether recent research has overfitted to memorizing data instead of learning abstract concepts. On the duplicate-free test sets, we find a significant drop in classification accuracy of between 9% and 14% relative to the original performance. We make both the ciFAIR dataset and pre-trained models publicly available and furthermore maintain a leaderboard for tracking the state of the art.
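The abstract does not say how the duplicates were located. As an illustration only, one common way to shortlist test images that may have a near-duplicate in the training set is a nearest-neighbor search over image feature vectors. The sketch below assumes each image has already been mapped to a nonzero feature vector by some model; the function names, the use of cosine similarity, and the 0.95 threshold are hypothetical choices, not taken from the paper.

```python
import numpy as np


def nearest_train_neighbors(train_feats, test_feats):
    """For each test vector, return the index of the most similar training
    vector and the cosine similarity to it.

    Assumes 2-D arrays of shape (n_images, n_dims) with nonzero rows.
    """
    # L2-normalize rows so that a dot product equals cosine similarity.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = test @ train.T  # shape: (n_test, n_train)
    nn_idx = sims.argmax(axis=1)
    nn_sim = sims[np.arange(len(test)), nn_idx]
    return nn_idx, nn_sim


def duplicate_candidates(train_feats, test_feats, threshold=0.95):
    """List (test_index, train_index, similarity) for test images whose
    nearest training neighbor reaches the (hypothetical) threshold."""
    nn_idx, nn_sim = nearest_train_neighbors(train_feats, test_feats)
    hits = np.flatnonzero(nn_sim >= threshold)
    return [(int(i), int(nn_idx[i]), float(nn_sim[i])) for i in hits]


# Toy features: the first test vector points almost exactly along the first
# training vector, the second does not resemble any training vector closely.
train = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
test = np.array([[2.0, 0.01], [1.0, -1.0]])
candidates = duplicate_candidates(train, test)
```

A threshold of this kind only yields candidates; whether a pair counts as a duplicate would still need to be confirmed, for example by visual inspection.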
Pages: 8