Distributionally Robust Loss for Long-Tailed Multi-label Image Classification

Cited: 0
Authors
Lin, Dekun [1 ,2 ]
Peng, Tailai [1 ,2 ]
Chen, Rui [1 ,2 ]
Xie, Xinran [1 ,2 ]
Qin, Xiaolin [1 ,2 ]
Cui, Zhe [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Chengdu Inst Comp Applicat, Chengdu 610213, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Source
COMPUTER VISION - ECCV 2024, PT XXXIII | 2025 / Vol. 15091
Keywords
Long-tailed learning; Multi-label classification; Loss
DOI
10.1007/978-3-031-73414-4_24
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The binary cross-entropy (BCE) loss function is widely utilized in multi-label classification (MLC) tasks, treating each label independently. The log-sum-exp pairwise (LSEP) loss, which emphasizes higher logits for positive classes over negative ones within a sample and accounts for label dependencies, has demonstrated effectiveness for MLC. However, our experiments suggest that its performance in long-tailed multi-label classification (LTMLC) is inferior to that of BCE. In this study, we investigate the impact of the log-sum-exp operation on recognition and explore optimization avenues. Our observations reveal two primary shortcomings of LSEP that lead to its poor performance in LTMLC: 1) the indiscriminate use of label dependencies without consideration of the distribution shift between training and test sets, and 2) the overconfidence in negative labels with features similar to those of positive labels. To mitigate these problems, we propose a distributionally robust loss (DR), which includes class-wise LSEP and a negative gradient constraint. Additionally, our findings indicate that the BCE-based loss is somewhat complementary to the LSEP-based loss, offering enhanced performance upon integration. Extensive experiments conducted on two LTMLC datasets, VOC-LT and COCO-LT, demonstrate the consistent effectiveness of our proposed method. Code: https://github.com/Kunmonkey/DR-Loss.
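As background for the abstract, the standard LSEP loss it builds on can be sketched for a single sample. This is a minimal illustration of the conventional log-sum-exp pairwise formulation (penalizing every negative logit that approaches or exceeds a positive logit), not the paper's proposed DR variant; the function name and plain-Python style are illustrative assumptions.

```python
import math

def lsep_loss(logits, labels):
    """Sketch of the standard log-sum-exp pairwise (LSEP) loss for one sample.

    logits: per-class scores f_k; labels: 1 for positive classes, 0 for negative.
    Returns log(1 + sum over (positive p, negative n) pairs of exp(f_n - f_p)),
    which grows when any negative logit rivals a positive one.
    Note: this is the conventional LSEP form, not the DR loss from the paper.
    """
    pos = [f for f, y in zip(logits, labels) if y == 1]
    neg = [f for f, y in zip(logits, labels) if y == 0]
    pairwise = sum(math.exp(n - p) for p in pos for n in neg)
    return math.log(1.0 + pairwise)
```

With a well-separated sample such as `lsep_loss([2.0, -1.0], [1, 0])` the loss is near zero, while swapping the logits so the negative class outscores the positive one drives it up sharply, which illustrates the "overconfident negatives" failure mode the abstract targets.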
Pages: 417-433 (17 pages)