A coarse-to-fine capsule network for fine-grained image categorization

Cited by: 7

Authors:

Lin, Zhongqi [1,2]

Jia, Jingdun [2]

Huang, Feng [3]

Gao, Wanlin [1,2]

Affiliations:

[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China

[2] Minist Agr & Rural Affairs, Key Lab Agr Informatizat Standardizat, Beijing 100083, Peoples R China

[3] China Agr Univ, Coll Sci, Beijing 100083, Peoples R China

Source:

NEUROCOMPUTING | 2021 / Vol. 456

Funding:

National Natural Science Foundation of China;

Keywords:

Capsule network (CapsNet); Fine-grained image classification; Coarse-to-fine attention; Increasingly specialized perception; MODEL;

DOI:

10.1016/j.neucom.2021.05.032

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Fine-grained image categorization is challenging because the subordinate categories within an entry-level category can be distinguished only by subtle differences. This necessitates alternately localizing key (most discriminative) regions and extracting domain-specific features. Existing methods predominantly perform these steps independently, ignoring that representation learning and foreground localization can reinforce each other iteratively. Building on the state-of-the-art performance of capsule encoding for abstract semantic representation, we formalize our pipeline as a coarse-to-fine capsule network (CTF-CapsNet). It consists of customized expert CapsNets at each perception scale and region proposal networks (RPNs) between each pair of adjacent scales. Their mutually motivated self-optimization achieves increasingly specialized cross-utilization of object-level and component-level descriptions. The RPN zooms in to turn attention to the most distinctive regions, using the information learned by the preceding expert CapsNet as a reference, while the finer-scale model takes as input an amplified attended patch from the previous scale. Overall, CTF-CapsNet is driven by three focal margin losses between label prediction and ground truth, and three regeneration losses between original input images/feature maps and reconstructed images. Experiments demonstrate that, without any prior knowledge or strongly supervised support (e.g., bounding-box/part annotations), CTF-CapsNet delivers categorization performance competitive with state-of-the-art methods: test accuracy reaches 89.57%, 88.63%, 90.51%, and 91.53% on our hand-crafted rice growth image set and three public benchmarks (CUB Birds, Stanford Dogs, and Stanford Cars), respectively. © 2021 Elsevier B.V. All rights reserved.
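
The abstract states that training is driven by three margin-type losses and three regeneration losses, one pair per perception scale. The following is a minimal sketch of how such a sum could be assembled. It is not the authors' implementation: it uses the standard CapsNet margin loss rather than the paper's focal variant (whose form is not given in this record), and the function names, the dictionary layout of `scales`, and the reconstruction weight of 0.0005 (the scaling used in the original CapsNet paper) are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def margin_loss(lengths: Sequence[float], target: int,
                m_pos: float = 0.9, m_neg: float = 0.1,
                lam: float = 0.5) -> float:
    """Standard CapsNet margin loss for one sample.

    `lengths` holds the output-capsule lengths, one per class.
    NOTE: this is the plain margin loss, not the paper's focal variant.
    """
    total = 0.0
    for k, v in enumerate(lengths):
        if k == target:
            # Penalize the true class if its capsule is shorter than m_pos.
            total += max(0.0, m_pos - v) ** 2
        else:
            # Penalize other classes if their capsules are longer than m_neg.
            total += lam * max(0.0, v - m_neg) ** 2
    return total


def reconstruction_loss(original: Sequence[float],
                        reconstructed: Sequence[float]) -> float:
    """Sum of squared errors between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed))


def total_loss(scales: List[Dict[str, Sequence[float]]], target: int,
               recon_weight: float = 0.0005) -> float:
    """Sum the classification and reconstruction terms over all scales.

    Each entry of `scales` is a hypothetical record with keys
    'lengths', 'input', and 'recon' for one perception scale.
    """
    return sum(
        margin_loss(s["lengths"], target)
        + recon_weight * reconstruction_loss(s["input"], s["recon"])
        for s in scales
    )


# Three identical toy scales, purely for illustration.
toy_scale = {"lengths": [0.5, 0.3], "input": [1.0, 0.0], "recon": [0.5, 0.0]}
loss = total_loss([toy_scale, toy_scale, toy_scale], target=0)
```

With three scales the sum has six terms, matching the three classification losses and three regeneration losses named in the abstract; how the paper actually weights them is not stated here.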

Pages: 200-219

Page count: 20
