HAR: Hardness Aware Reweighting for Imbalanced Datasets

Cited by: 5

Authors:

Duggal, Rahul [1]

Freitas, Scott [1]

Dhamnani, Sunny [1]

Chau, Duen Horng [1]

Sun, Jimeng [2]

Affiliations:

[1] Georgia Institute of Technology, Atlanta, GA 30332, USA

[2] University of Illinois, Urbana, IL, USA

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2021

Keywords:

Class imbalance; neural networks; hardness; reweighting; DATA-SETS; SMOTE

DOI:

10.1109/BigData52589.2021.9671807

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Class imbalance is a significant issue that causes neural networks to underfit to the rare classes. Traditional mitigation strategies include loss reshaping and data resampling, which amount to increasing the loss contribution of minority classes and decreasing the loss contributed by the majority ones. However, by treating each example within a class equally, these methods lead to undesirable scenarios where hard-to-classify examples from the majority classes are down-weighted and easy-to-classify examples from the minority classes are up-weighted. We propose the Hardness Aware Reweighting (HAR) framework, which circumvents this issue by increasing the loss contribution of hard examples from both the majority and minority classes. This is achieved by augmenting a neural network with intermediate classifier branches to enable early-exiting during training. Experimental results on large-scale datasets demonstrate that HAR consistently improves state-of-the-art accuracy while saving up to 20% of inference FLOPS.
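The abstract does not spell out the loss, so the following is only a minimal sketch of one plausible reading of the idea, not the authors' implementation. It assumes that each example accumulates cross-entropy at every intermediate branch it reaches and stops contributing once a branch is confident about it, so hard examples (which pass through more branches) carry more loss regardless of class. The function name `hardness_weighted_loss`, the confidence-based exit rule, and the `threshold` value are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def hardness_weighted_loss(branch_logits, labels, threshold=0.9):
    """Hypothetical early-exit loss (an assumption, not the paper's formula).

    branch_logits: list of (N, C) arrays, one per classifier branch,
                   ordered from the shallowest exit to the final one.
    labels:        (N,) integer class labels.
    Returns the mean accumulated loss and the number of branches
    each example passed through before exiting.
    """
    n = labels.shape[0]
    active = np.ones(n, dtype=bool)      # examples that have not exited yet
    per_example = np.zeros(n)            # accumulated loss per example
    exits_used = np.zeros(n, dtype=int)  # branches visited per example
    for logits in branch_logits:
        probs = softmax(logits)
        ce = -np.log(probs[np.arange(n), labels] + 1e-12)
        # Only examples still in the network contribute at this branch.
        per_example += np.where(active, ce, 0.0)
        exits_used += active
        # Assumed exit rule: leave once the top-class probability is high.
        confident = probs.max(axis=1) >= threshold
        active &= ~confident
    return per_example.mean(), exits_used
```

Under these assumptions, an example that the first branch already classifies confidently contributes a single loss term, while an ambiguous example contributes one term per branch it reaches, which is the sense in which its loss contribution is increased.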


Pages: 735-745

Page count: 11

