Multimodal Framework for Long-Tailed Recognition

Cited by: 0
Authors
Chen, Jian [1 ]
Zhao, Jianyin [1 ]
Gu, Jiaojiao [1 ]
Qin, Yufeng [1 ]
Ji, Hong [1 ]
Affiliation
[1] Naval Aviat Univ, Coll Coastal Def Force, Yantai 264001, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 22
Keywords
long-tailed recognition; vision-language models; imbalanced classification
DOI
10.3390/app142210572
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Long-tailed data distribution (i.e., a few classes account for most of the data, while most classes have very few samples) is a common problem in image classification. In this paper, we propose a novel two-stage multimodal framework for long-tailed data recognition. In the first stage, long-tailed data are used for visual-semantic contrastive learning to obtain good features; in the second stage, class-balanced data are used for classifier training. The proposed framework leverages the advantages of multimodal models and mitigates the problem of class imbalance in long-tailed data recognition. Experimental results demonstrate that the proposed framework achieves competitive performance on the CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist2018 datasets for image classification.
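The abstract does not say how the class-balanced data for the second stage are obtained. As a minimal sketch under stated assumptions, the snippet below builds a synthetic long-tailed label set and then draws from it so that every class is equally likely. The exponential count profile, the helper names `long_tailed_counts` and `class_balanced_sample`, and the numbers used (10 classes, 5000 samples in the largest class, imbalance ratio 100) are illustrative choices, not details taken from the paper.

```python
import random
from collections import Counter, defaultdict


def long_tailed_counts(num_classes, max_count, imbalance_ratio):
    """Per-class sample counts that decay exponentially from head to tail.

    Assumption (not from the paper): an exponential profile in which the
    largest class has `imbalance_ratio` times as many samples as the smallest.
    """
    counts = []
    for k in range(num_classes):
        frac = k / (num_classes - 1) if num_classes > 1 else 0.0
        counts.append(max(1, round(max_count * imbalance_ratio ** (-frac))))
    return counts


def class_balanced_sample(labels, num_draws, seed=0):
    """Draw sample indices so that every class is equally likely.

    A class is picked uniformly at random first, then one of its samples is
    picked uniformly at random (with replacement).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)
    return [rng.choice(by_class[rng.choice(classes)]) for _ in range(num_draws)]


# Hypothetical setup: 10 classes, head class 100x larger than the tail class.
counts = long_tailed_counts(num_classes=10, max_count=5000, imbalance_ratio=100)
labels = [k for k, n in enumerate(counts) for _ in range(n)]

# Stage-two style draw: roughly the same number of picks per class.
drawn = class_balanced_sample(labels, num_draws=10000)
drawn_hist = Counter(labels[i] for i in drawn)

print("long-tailed counts:", counts)
print("balanced draw histogram:", [drawn_hist[k] for k in range(10)])
```

Because tail samples are drawn with replacement, the same few tail images are repeated many times. That is one reason two-stage methods of this kind often restrict balanced training to the classifier and leave the learned features fixed.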
Pages: 14
Related papers
30 in total
  • [1] ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot
    Cai, Jiarui
    Wang, Yizhou
    Hwang, Jenq-Neng
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 112 - 121
  • [2] Cao KD, 2019, Arxiv, DOI arXiv:1906.07413
  • [3] Long-tail Detection with Effective Class-Margins
    Cho, Jang Hyun
    Krahenbuhl, Philipp
    [J]. COMPUTER VISION, ECCV 2022, PT VIII, 2022, 13668 : 698 - 714
  • [4] Chou H.-P., 2020, P COMP VIS ECCV 2020, P95
  • [5] AutoAugment: Learning Augmentation Strategies from Data
    Cubuk, Ekin D.
    Zoph, Barret
    Mane, Dandelion
    Vasudevan, Vijay
    Le, Quoc V.
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 113 - 123
  • [6] Class-Balanced Loss Based on Effective Number of Samples
    Cui, Yin
    Jia, Menglin
    Lin, Tsung-Yi
    Song, Yang
    Belongie, Serge
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9260 - 9269
  • [7] Joint representation and classifier learning for long-tailed image classification
    Guan, Qingji
    Li, Zhuangzhuang
    Zhang, Jiayu
    Huang, Yaping
    Zhao, Yao
    [J]. IMAGE AND VISION COMPUTING, 2023, 137
  • [8] Disentangling Label Distribution for Long-tailed Visual Recognition
    Hong, Youngkyu
    Han, Seungju
    Choi, Kwanghee
    Seo, Seokjun
    Kim, Beomsu
    Chang, Buru
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 6622 - 6632
  • [9] Jamal MA, 2020, PROC CVPR IEEE, P7607, DOI 10.1109/CVPR42600.2020.00763
  • [10] Kang B., 2019, arXiv, DOI 10.48550/arXiv.1910.09217