HAR: Hardness Aware Reweighting for Imbalanced Datasets

Cited by: 5
Authors
Duggal, Rahul [1]
Freitas, Scott [1]
Dhamnani, Sunny [1]
Chau, Duen Horng [1]
Sun, Jimeng [2]
Affiliations
[1] Georgia Inst Technol, Atlanta, GA 30332 USA
[2] Univ Illinois, Urbana, IL USA
Source
2021 IEEE International Conference on Big Data (Big Data) | 2021
Keywords
Class imbalance; neural networks; hardness; reweighting; DATA-SETS; SMOTE;
DOI
10.1109/BigData52589.2021.9671807
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Class imbalance is a significant issue that causes neural networks to underfit to the rare classes. Traditional mitigation strategies include loss reshaping and data resampling, which amount to increasing the loss contribution of minority classes and decreasing the loss contribution of majority ones. However, by treating each example within a class equally, these methods lead to undesirable scenarios where hard-to-classify examples from the majority classes are down-weighted and easy-to-classify examples from the minority classes are up-weighted. We propose the Hardness Aware Reweighting (HAR) framework, which circumvents this issue by increasing the loss contribution of hard examples from both the majority and minority classes. This is achieved by augmenting a neural network with intermediate classifier branches to enable early exiting during training. Experimental results on large-scale datasets demonstrate that HAR consistently improves state-of-the-art accuracy while saving up to 20% of inference FLOPs.
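The reweighting idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `har_style_loss`, the confidence threshold of 0.9, and the rule "exit at the first branch whose top probability reaches the threshold" are all assumptions made for illustration. The sketch only shows how early exiting during training could let hard examples accumulate loss from more classifier branches than easy ones, regardless of class.

```python
import math

def har_style_loss(branch_probs, label, threshold=0.9):
    """Hypothetical helper: sum cross-entropy over classifier branches
    until the example 'exits'.

    branch_probs: list of per-branch probability vectors, ordered from
                  the earliest branch to the final classifier.
    label:        index of the true class.
    threshold:    assumed confidence level at which an example exits.
    Returns (accumulated loss, number of branches that contributed).
    """
    total = 0.0
    branches_used = 0
    for probs in branch_probs:
        total += -math.log(probs[label])  # cross-entropy at this branch
        branches_used += 1
        if max(probs) >= threshold:       # confident enough: exit early
            break
    return total, branches_used

# An easy example is confident at the first branch; a hard one never is.
easy = [[0.95, 0.05], [0.97, 0.03], [0.99, 0.01]]
hard = [[0.55, 0.45], [0.60, 0.40], [0.70, 0.30]]

easy_loss, easy_branches = har_style_loss(easy, label=0)
hard_loss, hard_branches = har_style_loss(hard, label=0)
print(easy_branches, hard_branches)
print(hard_loss > easy_loss)
```

Under these assumptions the easy example contributes loss from one branch only, while the hard example contributes loss from all three, so its total loss contribution is larger whether it belongs to a majority or a minority class.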
Pages: 735-745
Page count: 11