An Analysis of the Interaction Between Transfer Learning Protocols in Deep Neural Networks

Cited by: 1
Authors
Plested, Jo [1]
Gedeon, Tom [1]
Affiliations
[1] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
Source
NEURAL INFORMATION PROCESSING (ICONIP 2019), PT I | 2019, Vol. 11953
Keywords
Transfer learning; Convolutional neural networks
DOI
10.1007/978-3-030-36708-4_26
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We extend work on the transferability of features in deep neural networks to explore the interaction between training hyperparameters, the optimal number of layers to transfer, and the size of the target dataset. We show that using commonly adopted transfer learning protocols results in increased overfitting and significantly decreased accuracy compared to optimal protocols, particularly for very small target datasets. We demonstrate that there is a relationship between the fine-tuning hyperparameters used and the optimal number of layers to transfer. Our research shows that if this relationship is not taken into account, the optimal number of layers to transfer to the target dataset will likely be estimated incorrectly. Best-practice transfer learning protocols cannot be predicted from existing research, which has analysed transfer learning under very specific conditions that are not universally applicable. Extrapolating transfer learning training settings from previous findings can in fact produce counterintuitive results, particularly in the case of smaller datasets. We present optimal transfer learning protocols for target dataset sizes ranging from very small to large, for the case where the source and target datasets and tasks are similar. Using these settings yields a large increase in accuracy compared to commonly used transfer learning protocols, and the gains are most significant with very small target datasets. We observed an increase in accuracy of 47.8% on our smallest dataset, which contained only 10 training examples per class. These findings are important as they are likely to improve outcomes from past, current and future research in transfer learning. We expect that researchers will want to re-examine their experiments to incorporate our findings and to check the robustness of their existing results.
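
For concreteness, the following is a minimal sketch of the kind of protocol the abstract discusses: transfer (here, freeze) the first k stages of an ImageNet-pretrained network and fine-tune the remainder on the target task. It assumes a PyTorch/torchvision setup; the ResNet-18 backbone, the stage split, and the learning rate are illustrative assumptions, not the authors' exact protocol.

# Minimal sketch (not the authors' exact protocol): transfer the first
# n_stages_to_freeze residual stages of a pretrained ResNet-18 and
# fine-tune the rest. Model choice and hyperparameters are illustrative.
import torch
import torch.nn as nn
from torchvision import models

def build_transfer_model(num_classes: int, n_stages_to_freeze: int = 2):
    """Load an ImageNet-pretrained ResNet-18, freeze the stem plus the
    first n_stages_to_freeze residual stages, and replace the head."""
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    stages = [model.layer1, model.layer2, model.layer3, model.layer4]
    for module in [model.conv1, model.bn1] + stages[:n_stages_to_freeze]:
        for param in module.parameters():
            param.requires_grad = False  # transferred layers stay fixed
    # New classifier head for the target task.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

model = build_transfer_model(num_classes=10, n_stages_to_freeze=2)
# Fine-tune only the unfrozen parameters. The learning rate interacts with
# how many layers were transferred, which is the interaction the paper
# studies; lr=0.01 is an illustrative value, not a recommendation.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.01,
    momentum=0.9,
)

Sweeping n_stages_to_freeze jointly against the fine-tuning learning rate and the target dataset size reproduces the kind of interaction grid the abstract describes; tuning either axis in isolation is what the paper argues leads to misestimating the optimal number of layers to transfer.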
Pages: 312-323
Page count: 12