Specific Expert Learning: Enriching Ensemble Diversity via Knowledge Distillation

Cited by: 6
Authors
Kao, Wei-Cheng [1 ]
Xie, Hong-Xia [2 ]
Lin, Chih-Yang [1 ]
Cheng, Wen-Huang [2 ,3 ]
Affiliations
[1] Yuan Ze Univ, Dept Elect Engn, Taoyuan 320, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Inst Elect, Hsinchu 300, Taiwan
[3] Natl Chung Hsing Univ, Artificial Intelligence & Data Sci Program, Taichung 400, Taiwan
Keywords
Predictive models; Diversity reception; Task analysis; Boosting; Visualization; MIMICs; Knowledge engineering; Deep learning; ensemble diversity; knowledge distillation (KD); NETWORK;
DOI
10.1109/TCYB.2021.3125320
CLC number
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
In recent years, ensemble methods have shown sterling performance and gained popularity in visual tasks. However, the performance of an ensemble is limited by the paucity of diversity among its models. To enrich the diversity of the ensemble, we present a distillation approach: learning from experts (LFEs). This approach is built on a novel knowledge distillation (KD) method that we propose, specific expert learning (SEL), which reduces class selectivity and improves both performance on specific weaker classes and overall accuracy. Through SEL, models can acquire different knowledge from distinct networks with different areas of expertise, and a highly diverse ensemble can then be obtained. Our experimental results demonstrate that, on CIFAR-10, SEL increases the accuracy of a single ResNet-32 by 0.91% and the accuracy of the ensemble by 1.13%. By comparison, the state-of-the-art DML improves accuracy by only 0.3% on a single ResNet-32 and 1.02% on the ensemble. Furthermore, our proposed architecture can also be applied to ensemble distillation (ED), which applies KD to the ensemble model. In conclusion, our experimental results show that the proposed SEL not only improves the accuracy of a single classifier but also boosts the diversity of the ensemble model.
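SEL builds on standard knowledge distillation, in which a student network is trained to match the softened output distribution of a teacher. The sketch below shows the classic KD objective (soft-target KL divergence plus hard-label cross-entropy, following Hinton et al.) that such methods extend; it is not the authors' SEL loss itself, and the function name, temperature T, and weighting alpha are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic knowledge-distillation loss (a minimal sketch, not SEL).

    Combines a soft-target KL term (teacher -> student, softened with
    temperature T) with the usual hard-label cross-entropy.
    """
    # Soften both distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence, scaled by T^2 to keep gradient magnitudes
    # comparable to the hard-label term.
    soft_loss = F.kl_div(log_soft_student, soft_teacher,
                         reduction="batchmean") * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In LFE-style training, the teacher above would be replaced by a network with a specific area of expertise, so that each student in the ensemble distills different knowledge and the ensemble as a whole gains diversity.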
Pages: 2494-2505
Page count: 12