Increasingly Specialized Generative Adversarial Network for fine-grained visual categorization

Cited by: 2

Authors:

Lin, Zhongqi [1,2]

Gao, Wanlin [1,2]

Huang, Feng [3]

Jia, Jingdun [2]

Affiliations:

[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China

[2] Minist Agr & Rural Affairs, Key Lab Agr Informatizat Standardizat, Beijing 100083, Peoples R China

[3] China Agr Univ, Coll Sci, Beijing 100083, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2021 / Vol. 232

Funding:

National Natural Science Foundation of China;

Keywords:

Generative adversarial networks; Fine-grained visual categorization; Laplacian residual; Patch proposal; Increasingly specialized perception; MODEL; SEGMENTATION; RECOGNITION;

DOI:

10.1016/j.knosys.2021.107480

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Fine-grained visual categorization is challenging because the subordinate categories within an entry-level category can only be distinguished by subtle discriminations. This necessitates localizing key (most discriminative) regions and extracting domain-specific features alternately, since implicit to fine-grained specialization is the existence of an entry-category visual shared among all classes. Existing methods predominantly implement fine-grained categorization independently, while neglecting that patch proposal and discrimination extraction are mutually correlated and can reinforce each other in an increasingly specialized manner. In this work, we concretize the above pipeline as an Increasingly Specialized Generative Adversarial Network (IS-GAN), which recursively shapes a coarse-to-fine representation. It is a three-scale framework consisting of two highlights: a three-player expert GAN at each scale for feature extraction, and a Patch Proposal Network (PPN) between two adjacent scales for target positioning. To better anatomize pixel-to-pixel correlations at various octaves, the Gaussian pyramid and Laplacian pyramid descriptions are also integrated in each GAN. The PPN zooms the areas to shift the focus onto the most representative regions by taking the previous prediction of the classifier as a reference, whilst a finer-scale network receives an amplified attended region from the previous scale. Overall, IS-GAN is driven by three focal losses from the GANs and a converged object-level loss. Experiments demonstrate that IS-GAN can simultaneously (1) deliver competitive categorization performance among state-of-the-art methods, i.e., validation accuracy reaches 92.23% and testing accuracy reaches 90.27%, and (2) recover fine-grained textures with high Peak Signal-to-Noise Ratios (PSNRs) (32.937) and Structural Similarities (SSIMs) (0.8607) on hand-crafted and public benchmarks. © 2021 Elsevier B.V. All rights reserved.
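Note on the reported PSNR: the abstract does not say how the PSNR figure was computed. As a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), the Python below may help when comparing against the reported value. The function name `psnr`, the 8-bit peak value of 255, and the use of NumPy are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio (in dB) between two same-shaped images.

    Hypothetical helper: max_value=255.0 assumes 8-bit pixel intensities,
    which the abstract does not confirm.
    """
    reference = np.asarray(reference, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    if reference.shape != reconstructed.shape:
        raise ValueError("images must have the same shape")
    # Mean squared error over all pixels (and channels, if present).
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Under this definition a higher value means a reconstruction closer to the reference, and identical images give an infinite PSNR.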

References

Pages: 24

85 entries in total
