The DNN learning method for few training data via knowledge transfer

Cited: 0
Authors
Nigaki Y. [1 ]
Inoue K. [1 ]
Yoshioka M. [1 ]
Affiliations
[1] Osaka Prefecture University, 1-1, Gakuen-cho, Naka-ku, Sakai, Osaka
Source
Inoue, Katsufumi (inoue@cs.osakafu-u.ac.jp) | Institute of Electrical Engineers of Japan, Vol. 140 (2020)
Keywords
Deep neural network; Knowledge distillation; Model compression; Transfer learning;
DOI
10.1541/ieejeiss.140.664
Abstract
Deep Neural Network (DNN) models have a huge number of parameters. This allows DNNs to achieve good performance, but it also causes two problems. First, training such a large number of parameters requires enormous amounts of training data. Second, high-spec devices are required because training so many parameters is computationally expensive. These problems hinder the deployment of DNNs in real tasks. To solve them, we propose a new DNN training method that combines transfer learning and knowledge distillation. The characteristic point of our proposed method is that we apply both techniques simultaneously when learning the DNN parameters, i.e., we transfer the feature maps of a teacher DNN to a student DNN that is smaller than the teacher. © 2020 The Institute of Electrical Engineers of Japan.
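The combined objective described in the abstract can be sketched as a task loss on the student's predictions plus a feature-map transfer term that pulls the student's intermediate representation toward the teacher's. This is a minimal NumPy stand-in, not the authors' actual implementation; the function names, the MSE transfer term, and the weighting parameter `alpha` are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true labels.
    n = probs.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

def distillation_loss(student_logits, labels,
                      student_fmap, teacher_fmap, alpha=0.5):
    """Task loss + alpha * MSE between student and teacher feature maps."""
    task = cross_entropy(softmax(student_logits), labels)
    transfer = np.mean((student_fmap - teacher_fmap) ** 2)
    return task + alpha * transfer

# Toy usage: a batch of 2 samples, 3 classes, 4-dimensional feature maps.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 3))
labels = np.array([0, 2])
s_fmap = rng.normal(size=(2, 4))   # student feature map
t_fmap = rng.normal(size=(2, 4))   # teacher feature map
loss = distillation_loss(logits, labels, s_fmap, t_fmap)
print(f"combined loss: {loss:.4f}")
```

In practice both terms would be minimized jointly by gradient descent, so the student simultaneously fits the labels and mimics the teacher's representation.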
Pages: 664-672
Page count: 8
References
34 in total; 10 listed
[1]  
Simonyan K., Zisserman A., Very Deep Convolutional Networks for Large-Scale Image Recognition, (2014)
[2]  
Chen L.-C., Papandreou G., Kokkinos I., Murphy K., Yuille A.L., Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, (2014)
[3]  
Deng J., Berg A., Satheesh S., Su H., Khosla A., Fei-Fei L., ILSVRC-2012, (2012)
[4]  
Krizhevsky A., Sutskever I., Hinton G.E., ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, pp. 1097-1105, (2012)
[5]  
Szegedy C., Liu W., Jia Y., Sermanet P., Reed S., Anguelov D., Erhan D., Vanhoucke V., Rabinovich A., Going deeper with convolutions, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, (2015)
[6]  
He K., Zhang X., Ren S., Sun J., Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, (2016)
[7]  
Raudys S.J., Jain A.K., Small sample size effects in statistical pattern recognition: Recommendations for practitioners, IEEE Transactions on Pattern Analysis & Machine Intelligence, 3, pp. 252-264, (1991)
[8]  
Pan S.J., Yang Q., A survey on transfer learning, IEEE Transactions on Knowledge and Data Engineering, 22, 10, pp. 1345-1359, (2009)
[9]  
Bucilua C., Caruana R., Niculescu-Mizil A., Model compression, Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 535-541, (2006)
[10]  
Deng J., Dong W., Socher R., Li L.-J., Li K., Fei-Fei L., ImageNet: A large-scale hierarchical image database, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248-255, (2009)