Specific Expert Learning: Enriching Ensemble Diversity via Knowledge Distillation

Cited by: 6
Authors
Kao, Wei-Cheng [1]
Xie, Hong-Xia [2]
Lin, Chih-Yang [1]
Cheng, Wen-Huang [2,3]
Affiliations
[1] Yuan Ze Univ, Dept Elect Engn, Taoyuan 320, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Inst Elect, Hsinchu 300, Taiwan
[3] Natl Chung Hsing Univ, Artificial Intelligence & Data Sci Program, Taichung 400, Taiwan
Keywords
Predictive models; Diversity reception; Task analysis; Boosting; Visualization; MIMICs; Knowledge engineering; Deep learning; ensemble diversity; knowledge distillation (KD); NETWORK;
DOI
10.1109/TCYB.2021.3125320
CLC number
TP [Automation Technology, Computer Technology];
Subject classification code
0812 (Computer Science and Technology);
Abstract
In recent years, ensemble methods have shown sterling performance and gained popularity in visual tasks. However, the performance of an ensemble is limited by the paucity of diversity among its models. Thus, to enrich the diversity of the ensemble, we present a distillation approach, learning from experts (LFEs). It centers on a novel knowledge distillation (KD) method, specific expert learning (SEL), which reduces class selectivity and improves both the performance on specific weaker classes and overall accuracy. Through SEL, models acquire different knowledge from distinct networks with different areas of expertise, and a highly diverse ensemble can then be obtained. Our experimental results demonstrate that, on CIFAR-10, SEL increases the accuracy of a single ResNet-32 by 0.91% and the accuracy of the ensemble trained with SEL by 1.13%. By comparison, the state-of-the-art DML improves accuracy by only 0.3% on a single ResNet-32 and 1.02% on the ensemble. Furthermore, the proposed architecture can also be applied to ensemble distillation (ED), which applies KD to the ensemble model. In conclusion, our experimental results show that the proposed SEL not only improves the accuracy of a single classifier but also boosts the diversity of the ensemble model.
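Note: this record contains only the abstract, so the exact SEL objective is not given here. As background, the sketch below shows the generic knowledge-distillation loss that approaches of this kind build on: a student network is trained against a teacher's temperature-softened outputs in addition to the usual hard-label cross-entropy. It is a minimal PyTorch illustration under assumptions, not the authors' SEL formulation; the function name distillation_loss, the temperature T, and the weight alpha are illustrative choices, and the expert selection and class-specific weighting that distinguish SEL appear only in the full paper.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft term: KL divergence between the student's and the teacher's
    # temperature-softened distributions, scaled by T^2 so its gradient
    # magnitude stays comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard term: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Illustrative call with random tensors (batch of 8, 10 classes as in CIFAR-10).
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)

In the learning-from-experts setting described in the abstract, teacher_logits would come from an expert network that is strong on the classes where the student is weak; that pairing logic is the paper's contribution and is not reproduced here.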
Pages: 2494-2505
Number of pages: 12
Related papers
52 records in total
  • [41] Similarity-Preserving Knowledge Distillation
    Tung, Frederick
    Mori, Greg
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 1365-1374
  • [42] DeepVID: Deep Visual Interpretation and Diagnosis for Image Classifiers via Knowledge Distillation
    Wang, Junpeng
    Gou, Liang
    Zhang, Wei
    Yang, Hao
    Shen, Han-Wei
    [J]. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2019, 25(6): 2168-2180
  • [43] Wang K., 2019, International Conference on Learning Representations (ICLR)
  • [44] STACKED GENERALIZATION
    WOLPERT, DH
    [J]. NEURAL NETWORKS, 1992, 5(2): 241-259
  • [45] AU-assisted Graph Attention Convolutional Network for Micro-Expression Recognition
    Xie, Hong-Xia
    Lo, Ling
    Shuai, Hong-Han
    Cheng, Wen-Huang
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 2871-2880
  • [46] Learning from Multiple Teacher Networks
    You, Shan
    Xu, Chang
    Xu, Chao
    Tao, Dacheng
    [J]. KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017: 1285-1294
  • [47] ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
    Zhang, Xiangyu
    Zhou, Xinyu
    Lin, Mengxiao
    Sun, Jian
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6848-6856
  • [48] Efficient Person Search via Expert-Guided Knowledge Distillation
    Zhang, Yaqing
    Li, Xi
    Zhang, Zhongfei
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51(10): 5093-5104
  • [49] Deep Mutual Learning
    Zhang, Ying
    Xiang, Tao
    Hospedales, Timothy M.
    Lu, Huchuan
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 4320-4328
  • [50] Highlight Every Step: Knowledge Distillation via Collaborative Teaching
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Chen, Changrui
    Dong, Zihe
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52(4): 2070-2081