Graph Complemented Latent Representation for Few-Shot Image Classification

Cited by: 38

Authors:

Zhong, Xian [1,2]

Gu, Cheng [3]

Ye, Mang [4]

Huang, Wenxin [5]

Lin, Chia-Wen [6,7]

Affiliations:

[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China

[2] Peking Univ, Sch Elect Engn & Comp Sci, Beijing 100091, Peoples R China

[3] ZhongNeng Power Tech Dev Co Ltd, Beijing 100034, Peoples R China

[4] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China

[5] Hubei Univ, Sch Comp Sci & Informat Engn, Wuhan 430062, Peoples R China

[6] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 30013, Taiwan

[7] Natl Tsing Hua Univ, Inst Commun Engn, Hsinchu 30013, Taiwan

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Keywords:

Few-shot learning; graph network; meta-learning; representation deficiency; variational inference; NETWORK;

DOI:

10.1109/TMM.2022.3141886

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Few-shot learning is a challenging problem because large numbers of training samples are hard to obtain in real applications, and it has attracted increasing attention recently. Meta-learning is a prominent approach to this problem; it aims to adapt predictors, acting as base-learners, to new tasks quickly. However, a key challenge of meta-learning is its limited expressive capacity, which stems from the difficulty of extracting general information from a small number of training samples. As a result, the generalizability of meta-learners trained in high-dimensional parameter spaces is frequently limited. To learn a better representation, we propose a graph complemented latent representation (GCLR) network for few-shot image classification. In particular, we embed the representation into a latent space, in which the latent codes are reconstructed using variational information to enrich the representation. In this way, the latent representation achieves better generalizability. Another benefit is that, because the latent space is formed using variational inference, it cooperates well with various base-learners, which improves robustness. To make full use of the relations between samples in each category, a graph neural network (GNN) is also incorporated to improve relation mining. Consequently, our end-to-end framework delivers competitive performance on three few-shot learning benchmarks for image classification.
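The abstract mentions a latent space formed by variational inference but gives no formulas. As a rough illustration only, the sketch below shows the two generic building blocks such a latent space commonly relies on: reparameterized sampling from a diagonal Gaussian and the closed-form KL divergence to a standard normal prior. The function names, the diagonal-Gaussian form, and the standard normal prior are all assumptions made for this example; none of them is taken from the GCLR paper, and this is not the authors' implementation.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one draw per dimension.

    Hypothetical helper: mu and log_var stand in for the outputs of some
    encoder; the abstract does not specify this parameterization.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ).

    Assumes a standard normal prior, which is a common default rather
    than something stated in the abstract.
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same z
    mu, log_var = [1.0, 0.0], [0.0, 0.0]
    z = reparameterize(mu, log_var, rng)
    print(len(z), kl_to_standard_normal(mu, log_var))
```

In a typical variational setup the KL term acts as a regularizer on the latent codes, while the sampled `z` is what a downstream base-learner would consume; how GCLR combines these with its graph neural network is described in the paper itself, not here.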

Pages: 1979-1990

Number of pages: 12
