Regularizing Class-wise Predictions via Self-knowledge Distillation

Cited by: 224
Authors
Yun, Sukmin [1]
Park, Jongjin [1]
Lee, Kimin [1,2]
Shin, Jinwoo [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
Keywords
NEURAL-NETWORKS
DOI
10.1109/CVPR42600.2020.01389
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks with millions of parameters may suffer from poor generalization due to overfitting. To mitigate the issue, we propose a new regularization method that penalizes the predictive distribution between similar samples. In particular, we distill the predictive distribution between different samples of the same label during training. This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network (i.e., a self-knowledge distillation) by forcing it to produce more meaningful and consistent predictions in a class-wise manner. Consequently, it mitigates overconfident predictions and reduces intra-class variations. Our experimental results on various image classification tasks demonstrate that the simple yet powerful method can significantly improve not only the generalization ability but also the calibration performance of modern convolutional neural networks.
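The regularizer described in the abstract pairs training samples that share a label and penalizes the divergence between their softened predictive distributions. Below is a minimal PyTorch sketch of such a class-wise consistency term; the function name, the pairing convention (pair_logits[i] shares the label of logits[i]), and the hyperparameters T and lam are illustrative assumptions, not the authors' exact implementation or settings.

```python
# Minimal sketch of a class-wise self-knowledge-distillation loss (assumed interface).
import torch
import torch.nn.functional as F

def class_wise_self_distillation_loss(logits, pair_logits, labels, T=4.0, lam=1.0):
    # Standard cross-entropy on the current samples.
    ce = F.cross_entropy(logits, labels)
    # Softened prediction of a paired same-class sample acts as the "teacher";
    # it is detached so no gradient flows through it.
    with torch.no_grad():
        teacher = F.softmax(pair_logits / T, dim=1)
    student_log = F.log_softmax(logits / T, dim=1)
    # KL divergence between the two class-wise predictions, scaled by T^2
    # as is conventional for temperature-scaled distillation.
    kd = F.kl_div(student_log, teacher, reduction="batchmean") * (T ** 2)
    return ce + lam * kd
```

In use, one would draw a second example of the same class for each training example and add this term to the usual training objective, which matches the abstract's goal of producing more consistent class-wise predictions and less overconfident outputs.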
Pages: 13873-13882
Page count: 10