Distributionally Robust Loss for Long-Tailed Multi-label Image Classification

Cited: 0
Authors
Lin, Dekun [1 ,2 ]
Peng, Tailai [1 ,2 ]
Chen, Rui [1 ,2 ]
Xie, Xinran [1 ,2 ]
Qin, Xiaolin [1 ,2 ]
Cui, Zhe [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Chengdu Inst Comp Applicat, Chengdu 610213, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Source
COMPUTER VISION - ECCV 2024, PT XXXIII | 2025 / Vol. 15091
Keywords
Long-tailed learning; Multi-label classification; Loss
DOI
10.1007/978-3-031-73414-4_24
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The binary cross-entropy (BCE) loss function is widely utilized in multi-label classification (MLC) tasks, treating each label independently. The log-sum-exp pairwise (LSEP) loss, which emphasizes higher logits for positive classes over negative ones within a sample and accounts for label dependencies, has demonstrated effectiveness for MLC. However, our experiments suggest that its performance in long-tailed multi-label classification (LTMLC) is inferior to that of BCE. In this study, we investigate the impact of the log-sum-exp operation on recognition and explore optimization avenues. Our observations reveal two primary shortcomings of LSEP that lead to its poor performance in LTMLC: 1) the indiscriminate use of label dependencies without consideration of the distribution shift between training and test sets, and 2) the overconfidence in negative labels with features similar to those of positive labels. To mitigate these problems, we propose a distributionally robust loss (DR), which includes class-wise LSEP and a negative gradient constraint. Additionally, our findings indicate that the BCE-based loss is somewhat complementary to the LSEP-based loss, offering enhanced performance upon integration. Extensive experiments conducted on two LTMLC datasets, VOC-LT and COCO-LT, demonstrate the consistent effectiveness of our proposed method. Code: https://github.com/Kunmonkey/DR-Loss.
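As background for the abstract, the standard LSEP loss it builds on can be sketched for a single sample. This is a minimal illustration of the conventional log-sum-exp pairwise formulation (penalizing every negative logit that approaches or exceeds a positive logit), not the paper's proposed DR variant; the function name and plain-Python style are illustrative assumptions.

```python
import math

def lsep_loss(logits, labels):
    """Sketch of the standard log-sum-exp pairwise (LSEP) loss for one sample.

    logits: per-class scores f_k; labels: 1 for positive classes, 0 for negative.
    Returns log(1 + sum over (positive p, negative n) pairs of exp(f_n - f_p)),
    which grows when any negative logit rivals a positive one.
    Note: this is the conventional LSEP form, not the DR loss from the paper.
    """
    pos = [f for f, y in zip(logits, labels) if y == 1]
    neg = [f for f, y in zip(logits, labels) if y == 0]
    pairwise = sum(math.exp(n - p) for p in pos for n in neg)
    return math.log(1.0 + pairwise)
```

With a well-separated sample such as `lsep_loss([2.0, -1.0], [1, 0])` the loss is near zero, while swapping the logits so the negative class outscores the positive one drives it up sharply, which illustrates the "overconfident negatives" failure mode the abstract targets.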
Pages: 417-433 (17 pages)