Balanced knowledge distillation for long-tailed learning

Cited by: 43
Authors
Zhang, Shaoyu [1 ,2 ]
Chen, Chen [1 ,2 ]
Hu, Xiyuan [3 ]
Peng, Silong [1 ,2 ,4 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Nanjing Univ Sci & Technol, Nanjing, Peoples R China
[4] Beijing ViSystem Co Ltd, Beijing, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Long-tailed learning; Knowledge distillation; Vision and text classification; SMOTE;
DOI
10.1016/j.neucom.2023.01.063
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Deep models trained on long-tailed datasets exhibit unsatisfactory performance on tail classes. Existing methods usually modify the classification loss to increase the learning focus on tail classes, which unexpectedly sacrifices the performance on head classes. In fact, this scheme leads to a contradiction between the two goals of long-tailed learning, i.e., learning generalizable representations and facilitating learning for tail classes. In this work, we explore knowledge distillation in long-tailed scenarios and propose a novel distillation framework, named Balanced Knowledge Distillation (BKD), to disentangle the contradiction between the two goals and achieve both simultaneously. Specifically, given a teacher model, we train the student model by minimizing the combination of an instance-balanced classification loss and a class-balanced distillation loss. The former benefits from sample diversity and learns generalizable representations, while the latter considers the class priors and facilitates learning for tail classes. We conduct extensive experiments on several long-tailed benchmark datasets and demonstrate that the proposed BKD is an effective knowledge distillation framework in long-tailed scenarios, as well as a competitive method for long-tailed learning. Our source code is available at https://github.com/EricZsy/BalancedKnowledgeDistillation. © 2023 Elsevier B.V. All rights reserved.
Pages: 36-46
Number of pages: 11
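
For readers who want a concrete picture of the objective described in the abstract, below is a minimal, hypothetical sketch of a BKD-style training loss in PyTorch. It assumes an inverse-class-frequency weighting for the distillation term and a standard temperature-scaled KL divergence; the paper's exact weighting scheme, temperature, and loss balancing may differ, and the function and parameter names (bkd_loss, class_counts, T, alpha) are illustrative only.

```python
# Illustrative sketch only, not the authors' implementation.
# Assumption: the class-balanced weighting uses normalized inverse class frequency.
import torch
import torch.nn.functional as F

def bkd_loss(student_logits, teacher_logits, targets, class_counts, T=2.0, alpha=1.0):
    """Instance-balanced cross-entropy + class-balanced distillation loss.

    student_logits, teacher_logits: (batch, num_classes) raw scores.
    targets: (batch,) ground-truth labels.
    class_counts: (num_classes,) per-class sample counts of the training set.
    """
    # Instance-balanced classification loss: plain cross-entropy over all samples,
    # preserving sample diversity for representation learning.
    ce = F.cross_entropy(student_logits, targets)

    # Class-balanced weights (assumption: inverse class frequency, normalized to mean 1).
    weights = 1.0 / class_counts.float()
    weights = weights / weights.sum() * len(class_counts)

    # Distillation term: KL divergence between temperature-softened distributions,
    # re-weighted per sample by the weight of its ground-truth class.
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    kl_per_sample = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=1)
    kd = (weights[targets] * kl_per_sample).mean() * (T * T)

    return ce + alpha * kd
```

In this sketch the cross-entropy term treats every sample equally (instance-balanced), while the distillation term up-weights samples from rare classes, mirroring the division of labor between the two losses described in the abstract.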