Omni-Training: Bridging Pre-Training and Meta-Training for Few-Shot Learning

Cited by: 6
Authors
Shu, Yang [1]
Cao, Zhangjie [1]
Gao, Jinghan [1]
Wang, Jianmin [1]
Yu, Philip S. [1]
Long, Mingsheng [1]
Affiliations
[1] Tsinghua Univ, Sch Software, BNRist, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Training; Adaptation models; Data models; Feature extraction; Deep learning; Bridges; Few-shot learning; data efficiency; transferability; meta-learning; pre-training
DOI
10.1109/TPAMI.2023.3319517
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Few-shot learning aims to adapt a deep model quickly from only a few examples. While both pre-training and meta-training can produce deep models that generalize well in few-shot settings, we find that pre-training focuses on cross-domain transferability while meta-training focuses on cross-task transferability, which restricts their data efficiency in entangled settings of domain shift and task shift. We thus propose the Omni-Training framework to seamlessly bridge pre-training and meta-training for data-efficient few-shot learning. Our first contribution is the tri-flow Omni-Net architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for improving domain transferability and task transferability, respectively. Omni-Net further coordinates the parallel flows by routing their representations through the joint flow, enabling knowledge transfer across flows. Our second contribution is the Omni-Loss, which applies a self-distillation strategy separately to the pre-training and meta-training objectives to boost knowledge transfer throughout the training stages. Omni-Training is a general framework that accommodates many existing algorithms. Evaluations show that this single framework consistently and clearly outperforms individual state-of-the-art methods in both cross-task and cross-domain settings, across a variety of classification, regression, and reinforcement learning problems.
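
The abstract describes the tri-flow architecture and the self-distillation loss only at a high level. The following PyTorch sketch illustrates both ideas under stated assumptions; it is not the authors' released implementation. The layer sizes, the additive routing of each parallel flow through the joint flow, and the `self_distill_loss` helper are all illustrative choices.

```python
# Minimal sketch of the tri-flow idea from the abstract; all module names,
# sizes, and the additive routing scheme are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OmniBlock(nn.Module):
    """One stage with three parallel flows: a pre-training flow (domain
    transferability), a joint flow (shared representation), and a
    meta-training flow (task transferability)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.pre = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
        self.joint = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
        self.meta = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())

    def forward(self, h_pre, h_joint, h_meta):
        z_joint = self.joint(h_joint)
        # Route the parallel flows through the joint flow so knowledge is
        # shared across flows (additive routing is an assumption here).
        z_pre = self.pre(h_pre) + z_joint
        z_meta = self.meta(h_meta) + z_joint
        return z_pre, z_joint, z_meta

class OmniNet(nn.Module):
    """Stack of OmniBlocks; returns the pre-training and meta-training
    features, each of which feeds its own training objective."""

    def __init__(self, dims=(32, 64, 64)):
        super().__init__()
        self.blocks = nn.ModuleList(
            [OmniBlock(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])]
        )

    def forward(self, x):
        h_pre = h_joint = h_meta = x
        for block in self.blocks:
            h_pre, h_joint, h_meta = block(h_pre, h_joint, h_meta)
        return h_pre, h_meta

def self_distill_loss(student_logits, teacher_logits, labels, tau=4.0, alpha=0.5):
    """Illustrative self-distillation term: a frozen earlier snapshot of the
    same flow (teacher) supervises the current flow (student), mixed with
    the ordinary task loss and applied separately to each flow's objective."""
    task = F.cross_entropy(student_logits, labels)
    distill = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    ) * tau ** 2
    return (1 - alpha) * task + alpha * distill

# Smoke test with random data.
net = OmniNet()
f_pre, f_meta = net(torch.randn(8, 32))
print(f_pre.shape, f_meta.shape)  # torch.Size([8, 64]) torch.Size([8, 64])
```

In this sketch, `f_pre` and `f_meta` would feed a standard supervised pre-training objective and an episodic meta-training objective, respectively, each regularized by the self-distillation term, mirroring the two-objective design the abstract outlines.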
Pages: 15275-15291 (17 pages)