Task-Adaptive Multi-Source Representations for Few-Shot Image Recognition

Cited by: 0

Authors:

Liu, Ge ^{[1]}

Zhang, Zhongqiang ^{[1]}

Fang, Xiangzhong ^{[1]}

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China

Source:

INFORMATION | 2024 / Vol. 15 / Issue 06

Keywords:

few-shot learning; image recognition; transfer learning; domain adaptation

DOI:

10.3390/info15060293

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Conventional few-shot learning (FSL) mainly focuses on knowledge transfer from a single source dataset to a recognition scenario that has only a few training samples but remains similar to the source domain. In this paper, we consider a more practical FSL setting where multiple semantically different datasets are available to address a wide range of FSL tasks, especially recognition scenarios beyond natural images, such as remote sensing and medical imagery. We refer to this setting as multi-source cross-domain FSL. To tackle the problem, we propose a two-stage learning scheme, termed learning and adapting multi-source representations (LAMR). In the first stage, we propose a multi-head network to obtain efficient multi-domain representations, where all source domains share the same backbone except for the last parallel projection layers, which provide domain specialization. We train the representations in a multi-task setting where each in-domain classification task is handled by a cosine classifier. In the second stage, considering that instance discrimination and class discrimination are crucial for robust recognition, we propose two contrastive objectives that adapt the pre-trained representations to each task using the few-shot data. Careful ablation studies verify that LAMR significantly improves representation transferability, showing consistent performance boosts. We also extend LAMR to single-source FSL by introducing a dataset-splitting strategy that equally splits one source dataset into sub-domains. The empirical results show that LAMR achieves state-of-the-art performance on the BSCD-FSL benchmark and competitive performance on mini-ImageNet, highlighting its versatility and effectiveness for FSL on both natural and domain-specific imagery.
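
The first-stage architecture described in the abstract (a shared backbone, one projection head per source domain, and a cosine classifier per in-domain task) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the class name MultiHeadModel, the single-layer ReLU "backbone", the layer sizes, the domain names, and the scale factor of 10 are all assumptions chosen for illustration, and no training loop is shown.

```python
import math
import random


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (eps avoids division by zero)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / (norm + eps) for a in v]


def cosine_logits(feature, class_weights, scale=10.0):
    """Cosine classifier: logit_k = scale * cos(feature, w_k)."""
    f = l2_normalize(feature)
    return [
        scale * sum(a * b for a, b in zip(f, l2_normalize(w)))
        for w in class_weights
    ]


class MultiHeadModel:
    """Hypothetical stand-in: shared backbone plus per-domain heads."""

    def __init__(self, in_dim, hidden_dim, proj_dim, classes_per_domain, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # One backbone shared by every source domain (assumed single layer).
        self.backbone = rand(hidden_dim, in_dim)
        # One parallel projection layer per source domain.
        self.heads = {d: rand(proj_dim, hidden_dim) for d in classes_per_domain}
        # One cosine classifier (class weight vectors) per source domain.
        self.classifiers = {
            d: rand(n, proj_dim) for d, n in classes_per_domain.items()
        }

    def embed(self, x, domain):
        hidden = [max(0.0, a) for a in matvec(self.backbone, x)]  # ReLU
        return matvec(self.heads[domain], hidden)

    def logits(self, x, domain, scale=10.0):
        return cosine_logits(self.embed(x, domain), self.classifiers[domain], scale)


# Usage: two hypothetical source domains with 5 and 2 classes.
model = MultiHeadModel(4, 8, 3, {"natural": 5, "texture": 2})
print(model.logits([0.5, -1.0, 2.0, 0.1], "natural"))
```

Because the logits depend only on the angle between the embedding and each class weight vector, they are bounded by the scale factor regardless of feature magnitude, which is the usual motivation for cosine classifiers in few-shot pre-training.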


Pages: 28
