Knowledge Augmentation for Distillation: A General and Effective Approach to Enhance Knowledge Distillation

Cited by: 0
Authors
Tang, Yinan [1]
Guo, Zhenhua [1]
Wang, Li [1]
Fan, Baoyu [1]
Cao, Fang [1]
Gao, Kai [1]
Zhang, Hongwei [1]
Li, Rengang [1]
Affiliations
[1] IEIT Systems Co., Ltd., Beijing, People's Republic of China
Source
PROCEEDINGS OF THE 1ST INTERNATIONAL WORKSHOP ON EFFICIENT MULTIMEDIA COMPUTING UNDER LIMITED RESOURCES, EMCLR 2024 | 2024
Keywords
Knowledge Distillation; Data Augmentation; Knowledge Augmentation; Metric Learning; Image Classification
DOI
10.1145/3688863.3689569
CLC Number
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
Knowledge Distillation (KD), which extracts knowledge from a large, well-performing neural network (the teacher network) to guide the training of a small network (the student network), has emerged as a promising approach to transfer learning and model compression. Unlike previous KD works, which focus on better transferring the knowledge that already exists in the teacher network, we enhance KD by augmenting and distilling extra knowledge. In this paper, we propose Knowledge Augmentation for Distillation (KAD), which mines and transfers augmented knowledge by generating augmented samples. We further strengthen KAD with a metric-learning objective, the N-pair loss, which makes full use of the augmented samples and boosts the compressed student network through its N-pair structure. Extensive experiments on widely used image benchmarks show that KAD not only works flexibly with various existing KD methods, but also delivers consistent improvements in classification accuracy.
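Since the abstract describes the method only at a high level, the following is a minimal, hypothetical PyTorch sketch of how distillation on augmented samples could be combined with an N-pair metric loss. The augmentation function, the (embedding, logits) model interface, and the loss weights alpha and beta are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    # Standard soft-label distillation: KL between temperature-scaled distributions.
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def n_pair_loss(anchors, positives):
    # N-pair loss (Sohn, 2016): each anchor should be most similar to its own
    # positive; the other N-1 positives in the batch act as negatives.
    logits = anchors @ positives.t()                       # (N, N) similarities
    targets = torch.arange(anchors.size(0), device=anchors.device)
    return F.cross_entropy(logits, targets)

def kad_step(student, teacher, x, y, augment, alpha=1.0, beta=0.5):
    # One hypothetical KAD-style training step. `student` and `teacher` are
    # assumed to return (embedding, logits); `augment` generates the
    # augmented samples from which extra knowledge is distilled.
    x_aug = augment(x)
    s_emb, s_logits = student(x)
    s_emb_aug, s_logits_aug = student(x_aug)
    with torch.no_grad():                                  # teacher is frozen
        _, t_logits = teacher(x)
        _, t_logits_aug = teacher(x_aug)                   # augmented knowledge
    loss = F.cross_entropy(s_logits, y)                    # supervised term
    loss = loss + alpha * (kd_loss(s_logits, t_logits)
                           + kd_loss(s_logits_aug, t_logits_aug))
    loss = loss + beta * n_pair_loss(F.normalize(s_emb, dim=1),
                                     F.normalize(s_emb_aug, dim=1))
    return loss
```

In this sketch, the original and augmented views of each image form the anchor-positive pairs of the N-pair structure, so every other sample in the batch serves as a negative without extra sampling.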
Pages: 23-31
Number of Pages: 9