Task-Adaptive Multi-Source Representations for Few-Shot Image Recognition

Cited by: 0
Authors
Liu, Ge [1]
Zhang, Zhongqiang [1]
Fang, Xiangzhong [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China
Keywords
few-shot learning; image recognition; transfer learning; domain adaptation
DOI
10.3390/info15060293
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Conventional few-shot learning (FSL) mainly focuses on knowledge transfer from a single source dataset to a recognition scenario with only a few training samples available but still similar to the source domain. In this paper, we consider a more practical FSL setting where multiple semantically different datasets are available to address a wide range of FSL tasks, especially for some recognition scenarios beyond natural images, such as remote sensing and medical imagery. It can be referred to as multi-source cross-domain FSL. To tackle the problem, we propose a two-stage learning scheme, termed learning and adapting multi-source representations (LAMR). In the first stage, we propose a multi-head network to obtain efficient multi-domain representations, where all source domains share the same backbone except for the last parallel projection layers for domain specialization. We train the representations in a multi-task setting where each in-domain classification task is taken by a cosine classifier. In the second stage, considering that instance discrimination and class discrimination are crucial for robust recognition, we propose two contrastive objectives for adapting the pre-trained representations to be task-specialized on the few-shot data. Careful ablation studies verify that LAMR significantly improves representation transferability, showing consistent performance boosts. We also extend LAMR to single-source FSL by introducing a dataset-splitting strategy that equally splits one source dataset into sub-domains. The empirical results show that LAMR can achieve SOTA performance on the BSCD-FSL benchmark and competitive performance on mini-ImageNet, highlighting its versatility and effectiveness for FSL of both natural and specific imaging.
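The first stage described in the abstract (a shared backbone, one projection head per source domain, and a cosine classifier for each in-domain task) can be illustrated with a minimal sketch. This is not the authors' implementation: the single linear layer standing in for the backbone, the layer sizes, the logit scale of 10.0, and the names `MultiHeadNet` and `cosine_logits` are all assumptions made for illustration.

```python
import math
import random


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (zero vectors stay zero)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / max(norm, eps) for a in v]


def cosine_logits(feature, class_weights, scale=10.0):
    """Cosine classifier: scaled cosine similarity to each class weight.

    The scale value is a hypothetical choice, not taken from the paper.
    """
    f = l2_normalize(feature)
    return [
        scale * sum(a * b for a, b in zip(f, l2_normalize(w)))
        for w in class_weights
    ]


class MultiHeadNet:
    """Shared backbone with one projection head and classifier per domain."""

    def __init__(self, in_dim, feat_dim, proj_dim, classes_per_domain, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Shared across all source domains (a linear stand-in for a real backbone).
        self.backbone = rand(feat_dim, in_dim)
        # Parallel, domain-specific projection layers.
        self.heads = [rand(proj_dim, feat_dim) for _ in classes_per_domain]
        # One cosine classifier (class weight matrix) per source domain.
        self.classifiers = [rand(n, proj_dim) for n in classes_per_domain]

    def forward(self, x, domain):
        hidden = [max(0.0, v) for v in matvec(self.backbone, x)]  # ReLU
        projected = matvec(self.heads[domain], hidden)
        return cosine_logits(projected, self.classifiers[domain])


# Two hypothetical source domains with 5 and 3 classes.
net = MultiHeadNet(in_dim=8, feat_dim=16, proj_dim=4, classes_per_domain=[5, 3])
sample = [0.1 * i for i in range(8)]
logits_domain0 = net.forward(sample, domain=0)
logits_domain1 = net.forward(sample, domain=1)
```

In a multi-task training loop, each batch would be routed through the head of its own source domain and the per-domain classification losses summed. The second-stage contrastive adaptation is not covered by this sketch.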
Pages: 28