Variational Information Distillation for Knowledge Transfer

Cited by: 488
Authors
Ahn, Sungsoo [1 ]
Hu, Shell Xu [2 ]
Damianou, Andreas [3 ]
Lawrence, Neil D. [3 ]
Dai, Zhenwen [3 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[2] Ecole Ponts ParisTech, Champs Sur Marne, France
[3] Amazon, Cambridge, England
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
DOI
10.1109/CVPR.2019.00938
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Transferring knowledge from a teacher neural network pretrained on the same or a similar task to a student neural network can significantly improve the performance of the student network. Existing knowledge transfer approaches match the activations or the corresponding hand-crafted features of the teacher and the student networks. We propose an information-theoretic framework for knowledge transfer that formulates knowledge transfer as maximizing the mutual information between the teacher and the student networks. We compare our method with existing knowledge transfer methods on both knowledge distillation and transfer learning tasks and show that our method consistently outperforms existing methods. We further demonstrate the strength of our method on knowledge transfer across heterogeneous network architectures by transferring knowledge from a convolutional neural network (CNN) to a multi-layer perceptron (MLP) on CIFAR-10. The resulting MLP significantly outperforms state-of-the-art methods and achieves performance similar to that of a CNN with a single convolutional layer.
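Concretely, maximizing the mutual information I(t; s) between a teacher feature map t and a student feature map s is intractable in general, so it is typically replaced by a variational lower bound E[log q(t | s)] for some tractable distribution q. The sketch below illustrates one way such a loss can be written in PyTorch, assuming a Gaussian q(t | s) whose mean is predicted from the student features by a 1x1 convolution and whose per-channel variance is a learned parameter; the class name VIDLoss, the choice of regressor, and the initialization value are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VIDLoss(nn.Module):
    """Variational-style feature matching (illustrative sketch): maximize a lower
    bound on I(teacher; student) by modelling the teacher's feature map with a
    Gaussian q(t | s) predicted from the student's feature map."""

    def __init__(self, student_channels: int, teacher_channels: int, init_scale: float = 5.0):
        super().__init__()
        # mu(s): a 1x1 conv mapping student feature maps into the teacher's channel
        # space (an assumed choice; any small regressor could play this role).
        self.mean_regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)
        # One variance parameter per teacher channel, kept positive via softplus.
        self.raw_scale = nn.Parameter(torch.full((teacher_channels,), init_scale))

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        # Assumes the two feature maps share the same spatial resolution (H, W).
        mu = self.mean_regressor(student_feat)                      # (N, C_t, H, W)
        var = F.softplus(self.raw_scale).view(1, -1, 1, 1) + 1e-6   # (1, C_t, 1, 1)
        # Negative Gaussian log-likelihood of the teacher's features under q(t | s);
        # minimizing it maximizes the variational lower bound on mutual information.
        nll = 0.5 * ((teacher_feat - mu) ** 2 / var + torch.log(var))
        return nll.mean()
```

In a training loop, this term would typically be added to the student's ordinary task loss, for example loss = task_loss + lam * vid_loss(student_feat, teacher_feat.detach()), with the teacher's features detached so that gradients flow only into the student and the variational parameters.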
Pages: 9155-9163 (9 pages)