Task-wise attention guided part complementary learning for few-shot image classification

Cited by: 55
Authors
Cheng, Gong [1 ,2 ,3 ]
Li, Ruimin [1 ,2 ,3 ]
Lang, Chunbo [1 ,2 ,3 ]
Han, Junwei [2 ]
Affiliations
[1] Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518057, Peoples R China
[2] Northwestern Polytech Univ, Sch Automat, Xian 710072, Peoples R China
[3] CETC Key Lab Aerosp Informat Applicat, Shijiazhuang 050081, Hebei, Peoples R China
Funding
National Natural Science Foundation of China;
关键词
few-shot learning; meta-learning; task-wise attention; part complementary learning;
DOI
10.1007/s11432-020-3156-7
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
A general framework for tackling few-shot learning is meta-learning, which aims to train a well-generalized meta-learner (or backbone network) that can learn a base-learner for each future task from a small amount of training data. Although much work has produced relatively good results, few-shot image classification still faces several challenges. First, meta-learning is a learning problem over a collection of tasks, and the meta-learner is usually shared among all tasks. To classify novel classes in different tasks, a base-learner must be learned for each task. Under these circumstances, how to specialize the base-learner so that it responds to inputs in a strongly task-wise manner remains a major challenge. Second, classification networks tend to identify local regions corresponding to the most discriminative object parts rather than the whole objects, resulting in incomplete feature representations. To address the first challenge, we propose a task-wise attention (TWA) module that guides the base-learner to extract task-specific image features. To address the second challenge, under the guidance of TWA, we propose a part complementary learning (PCL) module that extracts and fuses the features of multiple complementary parts of target objects, thereby obtaining more specific and complete information. Moreover, the proposed TWA and PCL modules can be embedded into a unified network for end-to-end training. Extensive experiments on two commonly used benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed method.
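The abstract does not give implementation details of the TWA module, but the general idea of conditioning features on the current task can be sketched roughly as follows. This is a minimal illustrative sketch in Python/PyTorch, assuming that a task embedding is pooled from the support-set features of an episode and used to re-weight feature channels of the query images; the class name TaskWiseAttention, the bottleneck sizes, and the pooling choice are assumptions for illustration, not the authors' actual implementation.

import torch
import torch.nn as nn

class TaskWiseAttention(nn.Module):
    """Illustrative sketch: derive channel attention from the support set of a task."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Small bottleneck MLP mapping a task embedding to per-channel weights in (0, 1).
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, support_feats: torch.Tensor, query_feats: torch.Tensor) -> torch.Tensor:
        # support_feats: (N_support, C, H, W), query_feats: (N_query, C, H, W)
        # Task embedding: average over all support images and spatial positions.
        task_embed = support_feats.mean(dim=(0, 2, 3))      # (C,)
        weights = self.fc(task_embed).view(1, -1, 1, 1)     # (1, C, 1, 1)
        # Re-weight query feature channels in a task-specific way.
        return query_feats * weights

# Toy usage with random features standing in for a hypothetical backbone's output.
twa = TaskWiseAttention(channels=64)
support = torch.randn(5, 64, 10, 10)    # e.g., 5-way 1-shot support features
query = torch.randn(15, 64, 10, 10)
out = twa(support, query)               # (15, 64, 10, 10)

Because the attention weights depend only on the support set of the current episode, the same backbone can respond differently to different tasks, which is the behavior the abstract attributes to TWA.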
Pages: 14