An Analysis of the Interaction Between Transfer Learning Protocols in Deep Neural Networks

Cited by: 1
Authors
Plested, Jo [1]
Gedeon, Tom [1]
Affiliations
[1] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
Source
NEURAL INFORMATION PROCESSING (ICONIP 2019), PT I | 2019, Vol. 11953
Keywords
Transfer learning; Convolutional neural networks
DOI
10.1007/978-3-030-36708-4_26
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We extend work on the transferability of features in deep neural networks to explore the interaction between training hyperparameters, the optimal number of layers to transfer, and the size of the target dataset. We show that using commonly adopted transfer learning protocols results in increased overfitting and significantly decreased accuracy compared to optimal protocols, particularly for very small target datasets. We demonstrate that there is a relationship between the fine-tuning hyperparameters used and the optimal number of layers to transfer. Our research shows that if this relationship is not taken into account, the optimal number of layers to transfer to the target dataset will likely be estimated incorrectly. Best-practice transfer learning protocols cannot be predicted from existing research, which has analysed transfer learning under very specific conditions that are not universally applicable. Extrapolating transfer learning training settings from previous findings can in fact produce counterintuitive results, particularly in the case of smaller datasets. We present optimal transfer learning protocols for target dataset sizes ranging from very small to large, for the case where the source and target datasets and tasks are similar. Using these settings yields a large increase in accuracy compared to commonly used transfer learning protocols, and the gains are most significant with very small target datasets. We observed an increase in accuracy of 47.8% on our smallest dataset, which contained only 10 training examples per class. These findings are important as they are likely to improve outcomes from past, current and future research in transfer learning. We expect that researchers will want to re-examine their experiments to incorporate our findings and to check the robustness of their existing results.
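
For concreteness, the following is a minimal sketch of the kind of protocol the abstract discusses: transfer (here, freeze) the first k stages of an ImageNet-pretrained network and fine-tune the remainder on the target task. It assumes a PyTorch/torchvision setup; the ResNet-18 backbone, the stage split, and the learning rate are illustrative assumptions, not the authors' exact protocol.

# Minimal sketch (not the authors' exact protocol): transfer the first
# n_stages_to_freeze residual stages of a pretrained ResNet-18 and
# fine-tune the rest. Model choice and hyperparameters are illustrative.
import torch
import torch.nn as nn
from torchvision import models

def build_transfer_model(num_classes: int, n_stages_to_freeze: int = 2):
    """Load an ImageNet-pretrained ResNet-18, freeze the stem plus the
    first n_stages_to_freeze residual stages, and replace the head."""
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    stages = [model.layer1, model.layer2, model.layer3, model.layer4]
    for module in [model.conv1, model.bn1] + stages[:n_stages_to_freeze]:
        for param in module.parameters():
            param.requires_grad = False  # transferred layers stay fixed
    # New classifier head for the target task.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

model = build_transfer_model(num_classes=10, n_stages_to_freeze=2)
# Fine-tune only the unfrozen parameters. The learning rate interacts with
# how many layers were transferred, which is the interaction the paper
# studies; lr=0.01 is an illustrative value, not a recommendation.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.01,
    momentum=0.9,
)

Sweeping n_stages_to_freeze jointly against the fine-tuning learning rate and the target dataset size reproduces the kind of interaction grid the abstract describes; tuning either axis in isolation is what the paper argues leads to misestimating the optimal number of layers to transfer.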
Pages: 312-323
Page count: 12