Class-Balanced Loss Based on Effective Number of Samples

Cited by: 1753
Authors
Cui, Yin [1 ,2 ,5 ]
Jia, Menglin [1 ]
Lin, Tsung-Yi [3 ]
Song, Yang [4 ]
Belongie, Serge [1 ,2 ]
Affiliations
[1] Cornell Univ, Ithaca, NY 14853 USA
[2] Cornell Tech, New York, NY 10044 USA
[3] Google Brain, Mountain View, CA USA
[4] Alphabet Inc, Mountain View, CA USA
[5] Google, Mountain View, CA 94043 USA
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
DOI
10.1109/CVPR.2019.00949
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point diminishes. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by the simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0, 1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and on large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network achieves significant performance gains on long-tailed datasets.
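As an illustration of the re-weighting scheme the abstract describes (not part of the original record), here is a minimal Python sketch. The function name, the example `beta` values, and the choice to normalize the weights so they sum to the number of classes are assumptions for this sketch; only the effective-number formula $(1-\beta^{n})/(1-\beta)$ comes from the abstract.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights from the effective number of samples.

    Effective number: E_n = (1 - beta**n) / (1 - beta), as in the abstract.
    Weights are the inverse of E_n; the normalization so they sum to the
    number of classes is an illustrative choice, not taken from the record.
    """
    n = np.asarray(samples_per_class, dtype=np.float64)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()

# Example: a long-tailed class distribution; rarer classes get larger weights.
counts = [5000, 500, 50, 5]
print(class_balanced_weights(counts, beta=0.99))
```

The resulting vector could then serve as per-class weights in a standard weighted loss (e.g., weighted cross-entropy), which is the sense in which the abstract's scheme "re-balances the loss".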
Pages: 9260-9269
Page count: 10