TOAN: Target-Oriented Alignment Network for Fine-Grained Image Categorization With Few Labeled Samples

Cited by: 55
Authors
Huang, Huaxi [1]
Zhang, Junjie [2]
Yu, Litao [1]
Zhang, Jian [1]
Wu, Qiang [1]
Xu, Chang [3]
Affiliations
[1] Univ Technol Sydney, Fac Engn & Informat Technol, Sydney, NSW 2007, Australia
[2] Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Key Lab Specialty Fiber Opt & Opt Access Networks, Joint Int Res Lab Specialty Fiber Opt & Adv Commu, Shanghai 200444, Peoples R China
[3] Univ Sydney, Sch Comp Sci, Sydney, NSW 2006, Australia
Keywords
Feature extraction; Task analysis; Visualization; Training; Optical fibers; Computational modeling; Phase change materials; Fine-grained image classification; few-shot setting; second-order relation extraction; RECOGNITION;
DOI
10.1109/TCSVT.2021.3065693
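Chinese Library Classification (CLC) number
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline classification code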
0808 ; 0809 ;
Abstract
In this paper, we study the fine-grained categorization problem under the few-shot setting, i.e., each fine-grained class contains only a few labeled examples, termed Fine-Grained Few-Shot classification (FGFS). The core difficulty in FGFS is the high intra-class variance combined with low inter-class variance in the dataset. In traditional fine-grained classification, the high intra-class variance can be partly relieved by supervised training on abundant labeled samples. However, with few labeled examples, it is hard for an FGFS model to learn a robust class representation under the significantly higher intra-class variance. Moreover, the inter- and intra-class variances are closely related: the significant intra-class variance in FGFS often aggravates the low inter-class variance issue. To address these challenges, we propose a Target-Oriented Alignment Network (TOAN) that tackles the FGFS problem from both the intra- and inter-class perspectives. To reduce the intra-class variance, we propose a target-oriented matching mechanism that reformulates the spatial features of each support image to match those of the query in the embedding space. To enhance the inter-class discrimination, we devise discriminative fine-grained features by integrating local compositional concept representations with global second-order pooling. We conducted extensive experiments on four public datasets for fine-grained categorization, and the results show that the proposed TOAN achieves state-of-the-art performance.
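The abstract names two operations without giving formulas: matching support features to the query, and global second-order pooling. The following is a minimal sketch of one common way to realize each, not the authors' implementation. It assumes a softmax cross-attention form for the matching and a plain averaged Gram matrix for the pooling; the function names (align_support_to_query, second_order_pool) and the toy feature values are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in a:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def align_support_to_query(support, query):
    """Assumed matching step: rebuild the support features at each query position.

    support: m spatial positions x c channels
    query:   n spatial positions x c channels
    Returns an n x c matrix in which row i is a weighted mix of support
    positions, weighted by their similarity to query position i.
    """
    scores = matmul(query, transpose(support))   # n x m similarities
    weights = softmax_rows(scores)               # each row sums to 1
    return matmul(weights, support)              # n x c aligned support


def second_order_pool(features):
    """Assumed pooling step: averaged Gram matrix (c x c) over spatial positions."""
    n = len(features)
    gram = matmul(transpose(features), features)
    return [[v / n for v in row] for row in gram]


# Toy example: 3 support positions, 2 query positions, 2 channels.
support = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [[2.0, 0.0], [0.0, 2.0]]
aligned = align_support_to_query(support, query)
pooled = second_order_pool(aligned)
```

In this sketch the aligned support has the same spatial layout as the query, so position-wise comparison becomes meaningful, and the pooled matrix captures pairwise channel interactions rather than per-channel averages.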
Pages: 853-866
Page count: 14