Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

Cited by: 102
Authors
Rizve, Mamshad Nayeem [1 ]
Khan, Salman [2 ]
Khan, Fahad Shahbaz [2 ]
Shah, Mubarak [1 ]
Affiliations
[1] University of Central Florida, Center for Research in Computer Vision, Orlando, FL 32816 USA
[2] Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Source
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021) | 2021
DOI
10.1109/CVPR46437.2021.01069
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In many real-world problems, collecting a large number of labeled samples is infeasible. Few-shot learning (FSL) is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples. FSL tasks have been predominantly solved by leveraging ideas from gradient-based meta-learning and metric-learning approaches. However, recent works have demonstrated that powerful feature representations learned with a simple embedding network can outperform existing sophisticated FSL algorithms. In this work, we build on this insight and propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations. Equivariance and invariance have each been employed standalone in previous works; however, to the best of our knowledge, they have not been used jointly. Simultaneously optimizing both of these contrasting objectives allows the model to learn features that are not only independent of the input transformation but also encode the structure of the geometric transformations. These complementary sets of features help the model generalize well to novel classes with only a few samples. We achieve additional improvements by incorporating a novel self-supervised distillation objective. Our extensive experimentation shows that, even without knowledge distillation, our proposed method outperforms current state-of-the-art FSL methods on five popular benchmark datasets.
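
To make the described training mechanism concrete, below is a minimal PyTorch sketch of jointly optimizing an invariance objective and an equivariance objective over a set of 90-degree image rotations. It is not the authors' implementation: the toy encoder, the head names (cls_head, equi_head), and the consistency-style invariance term are assumptions introduced purely for illustration, and the self-supervised distillation stage is omitted.

# Minimal, illustrative sketch (not the authors' released code) of jointly
# optimizing invariance and equivariance objectives over 90-degree rotations.
import torch
import torch.nn as nn
import torch.nn.functional as F

ROTATIONS = [0, 1, 2, 3]  # multiples of 90 degrees applied to each image

class JointModel(nn.Module):
    def __init__(self, feat_dim=64, num_classes=5):
        super().__init__()
        # Toy encoder standing in for a ResNet-style backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.cls_head = nn.Linear(feat_dim, num_classes)      # class logits (invariant target)
        self.equi_head = nn.Linear(feat_dim, len(ROTATIONS))  # which rotation was applied?

    def forward(self, x):
        z = self.encoder(x)
        return self.cls_head(z), self.equi_head(z)

def joint_loss(model, images, labels):
    """Invariance: class predictions should agree across rotated copies of an image.
    Equivariance: the features must still reveal which rotation was applied."""
    cls_losses, equi_losses, probs = [], [], []
    for r in ROTATIONS:
        x_r = torch.rot90(images, k=r, dims=(2, 3))
        cls_logits, equi_logits = model(x_r)
        cls_losses.append(F.cross_entropy(cls_logits, labels))
        rot_target = torch.full((images.size(0),), r, dtype=torch.long, device=images.device)
        equi_losses.append(F.cross_entropy(equi_logits, rot_target))
        probs.append(F.softmax(cls_logits, dim=1))
    # Invariance enforced as consistency: each rotated view matches the mean prediction.
    mean_p = torch.stack(probs).mean(dim=0).detach()
    inv_loss = sum(F.kl_div(torch.log(p.clamp_min(1e-8)), mean_p, reduction='batchmean')
                   for p in probs)
    return sum(cls_losses) + sum(equi_losses) + inv_loss

# Dummy usage on random data.
model = JointModel()
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 5, (8,))
loss = joint_loss(model, images, labels)
loss.backward()

The key design point mirrored from the abstract is that the same shared embedding feeds two contrasting heads: one whose targets are unchanged under the transformations (invariance) and one whose targets are the transformations themselves (equivariance), so the features must satisfy both.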
Pages: 10831-10841
Number of pages: 11