Specific Expert Learning: Enriching Ensemble Diversity via Knowledge Distillation

Cited by: 6
Authors
Kao, Wei-Cheng [1 ]
Xie, Hong-Xia [2 ]
Lin, Chih-Yang [1 ]
Cheng, Wen-Huang [2 ,3 ]
Affiliations
[1] Yuan Ze Univ, Dept Elect Engn, Taoyuan 320, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Inst Elect, Hsinchu 300, Taiwan
[3] Natl Chung Hsing Univ, Artificial Intelligence & Data Sci Program, Taichung 400, Taiwan
Keywords
Predictive models; Diversity reception; Task analysis; Boosting; Visualization; MIMICs; Knowledge engineering; Deep learning; ensemble diversity; knowledge distillation (KD); NETWORK;
DOI
10.1109/TCYB.2021.3125320
CLC number
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
In recent years, ensemble methods have shown sterling performance and gained popularity in visual tasks. However, the performance of an ensemble is limited by the paucity of diversity among its models. To enrich the diversity of the ensemble, we present a distillation approach: learning from experts (LFEs). This approach is built on a novel knowledge distillation (KD) method that we propose, specific expert learning (SEL), which reduces class selectivity and improves both performance on specific weaker classes and overall accuracy. Through SEL, models can acquire different knowledge from distinct networks with different areas of expertise, and a highly diverse ensemble can then be obtained. Our experimental results demonstrate that, on CIFAR-10, SEL increases the accuracy of a single ResNet-32 by 0.91% and the accuracy of the ensemble by 1.13%. By comparison, the state-of-the-art DML improves accuracy by only 0.3% on a single ResNet-32 and 1.02% on the ensemble. Furthermore, our proposed architecture can also be applied to ensemble distillation (ED), which applies KD to the ensemble model. In conclusion, our experimental results show that the proposed SEL not only improves the accuracy of a single classifier but also boosts the diversity of the ensemble model.
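SEL builds on standard knowledge distillation, in which a student network is trained to match the softened output distribution of a teacher. The sketch below shows the classic KD objective (soft-target KL divergence plus hard-label cross-entropy, following Hinton et al.) that such methods extend; it is not the authors' SEL loss itself, and the function name, temperature T, and weighting alpha are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic knowledge-distillation loss (a minimal sketch, not SEL).

    Combines a soft-target KL term (teacher -> student, softened with
    temperature T) with the usual hard-label cross-entropy.
    """
    # Soften both distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence, scaled by T^2 to keep gradient magnitudes
    # comparable to the hard-label term.
    soft_loss = F.kl_div(log_soft_student, soft_teacher,
                         reduction="batchmean") * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In LFE-style training, the teacher above would be replaced by a network with a specific area of expertise, so that each student in the ensemble distills different knowledge and the ensemble as a whole gains diversity.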
Pages: 2494-2505
Page count: 12