Omni-Training: Bridging Pre-Training and Meta-Training for Few-Shot Learning

Cited by: 6
Authors
Shu, Yang [1]
Cao, Zhangjie [1]
Gao, Jinghan [1]
Wang, Jianmin [1]
Yu, Philip S. [1]
Long, Mingsheng [1]
Affiliations
[1] Tsinghua Univ, Sch Software, BNRist, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Training; Adaptation models; Data models; Feature extraction; Deep learning; Bridges; Few-shot learning; data efficiency; transferability; meta-learning; pre-training
DOI
10.1109/TPAMI.2023.3319517
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Few-shot learning aims to adapt a deep model quickly from only a few examples. While both pre-training and meta-training can produce deep models that generalize well in few-shot settings, we find that pre-training focuses on cross-domain transferability while meta-training focuses on cross-task transferability, which restricts their data efficiency in entangled settings of domain shift and task shift. We thus propose the Omni-Training framework to seamlessly bridge pre-training and meta-training for data-efficient few-shot learning. Our first contribution is the tri-flow Omni-Net architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for improving domain transferability and task transferability, respectively. Omni-Net further coordinates the parallel flows by routing their representations through the joint flow, enabling knowledge transfer across flows. Our second contribution is the Omni-Loss, which applies a self-distillation strategy separately to the pre-training and meta-training objectives to boost knowledge transfer throughout the training stages. Omni-Training is a general framework that accommodates many existing algorithms. Evaluations show that this single framework consistently and clearly outperforms individual state-of-the-art methods in both cross-task and cross-domain settings, across a variety of classification, regression, and reinforcement learning problems.
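
The abstract describes the tri-flow architecture and the self-distillation loss only at a high level. The following PyTorch sketch illustrates both ideas under stated assumptions; it is not the authors' released implementation. The layer sizes, the additive routing of each parallel flow through the joint flow, and the `self_distill_loss` helper are all illustrative choices.

```python
# Minimal sketch of the tri-flow idea from the abstract; all module names,
# sizes, and the additive routing scheme are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OmniBlock(nn.Module):
    """One stage with three parallel flows: a pre-training flow (domain
    transferability), a joint flow (shared representation), and a
    meta-training flow (task transferability)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.pre = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
        self.joint = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
        self.meta = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())

    def forward(self, h_pre, h_joint, h_meta):
        z_joint = self.joint(h_joint)
        # Route the parallel flows through the joint flow so knowledge is
        # shared across flows (additive routing is an assumption here).
        z_pre = self.pre(h_pre) + z_joint
        z_meta = self.meta(h_meta) + z_joint
        return z_pre, z_joint, z_meta

class OmniNet(nn.Module):
    """Stack of OmniBlocks; returns the pre-training and meta-training
    features, each of which feeds its own training objective."""

    def __init__(self, dims=(32, 64, 64)):
        super().__init__()
        self.blocks = nn.ModuleList(
            [OmniBlock(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])]
        )

    def forward(self, x):
        h_pre = h_joint = h_meta = x
        for block in self.blocks:
            h_pre, h_joint, h_meta = block(h_pre, h_joint, h_meta)
        return h_pre, h_meta

def self_distill_loss(student_logits, teacher_logits, labels, tau=4.0, alpha=0.5):
    """Illustrative self-distillation term: a frozen earlier snapshot of the
    same flow (teacher) supervises the current flow (student), mixed with
    the ordinary task loss and applied separately to each flow's objective."""
    task = F.cross_entropy(student_logits, labels)
    distill = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    ) * tau ** 2
    return (1 - alpha) * task + alpha * distill

# Smoke test with random data.
net = OmniNet()
f_pre, f_meta = net(torch.randn(8, 32))
print(f_pre.shape, f_meta.shape)  # torch.Size([8, 64]) torch.Size([8, 64])
```

In this sketch, `f_pre` and `f_meta` would feed a standard supervised pre-training objective and an episodic meta-training objective, respectively, each regularized by the self-distillation term, mirroring the two-objective design the abstract outlines.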
Pages: 15275-15291 (17 pages)