A coarse-to-fine capsule network for fine-grained image categorization

Cited by: 7
Authors
Lin, Zhongqi [1, 2]
Jia, Jingdun [2]
Huang, Feng [3]
Gao, Wanlin [1, 2]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr & Rural Affairs, Key Lab Agr Informatizat Standardizat, Beijing 100083, Peoples R China
[3] China Agr Univ, Coll Sci, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Capsule network (CapsNet); Fine-grained image classification; Coarse-to-fine attention; Increasingly specialized perception; MODEL;
DOI
10.1016/j.neucom.2021.05.032
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fine-grained image categorization is challenging because the subordinate categories within an entry-level category can only be distinguished by subtle differences. This necessitates alternately localizing the key (most discriminative) regions and extracting domain-specific features. Existing methods predominantly treat these tasks independently, ignoring that representation learning and foreground localization can reinforce each other iteratively. Building on the state-of-the-art performance of capsule encoding for abstract semantic representation, we formalize our pipeline as a coarse-to-fine capsule network (CTF-CapsNet). It consists of customized expert CapsNets, one at each perception scale, and region proposal networks (RPNs) between each pair of adjacent scales. Their mutually reinforcing optimization achieves increasingly specialized cross-utilization of object-level and component-level descriptions. Each RPN zooms in on the most distinctive regions, using the information learned by the preceding expert CapsNet as a reference, while the finer-scale model takes as input an amplified attended patch from the previous scale. Overall, CTF-CapsNet is driven by three focal margin losses between label predictions and ground truth, and three regeneration losses between the original input images/feature maps and the reconstructed images. Experiments demonstrate that, without any prior knowledge or strong supervision (e.g., bounding-box or part annotations), CTF-CapsNet delivers categorization performance competitive with state-of-the-art methods: testing accuracy reaches 89.57%, 88.63%, 90.51%, and 91.53% on our hand-crafted rice growth image set and three public benchmarks (CUB Birds, Stanford Dogs, and Stanford Cars), respectively. © 2021 Elsevier B.V. All rights reserved.
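The abstract describes the training objective only in words: one classification loss and one regeneration loss per scale, summed over three scales. The sketch below illustrates that structure under stated assumptions. The record does not define the "focal margin loss", so the plain CapsNet margin loss (with the commonly used values m+ = 0.9, m- = 0.1, lambda = 0.5) stands in for it; the regeneration loss is assumed to be a mean squared error; and `recon_weight`, the function names, and the dictionary keys are hypothetical, not taken from the paper.

```python
from typing import Dict, List, Sequence


def margin_loss(lengths: Sequence[float], target: int,
                m_pos: float = 0.9, m_neg: float = 0.1,
                lam: float = 0.5) -> float:
    """Plain CapsNet margin loss for one sample (stand-in, not the paper's focal variant).

    lengths[k] is the length of the output capsule for class k, in [0, 1].
    """
    total = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # Penalize a too-short capsule for the true class.
            total += max(0.0, m_pos - length) ** 2
        else:
            # Penalize a too-long capsule for every other class, down-weighted by lam.
            total += lam * max(0.0, length - m_neg) ** 2
    return total


def regeneration_loss(original: Sequence[float],
                      reconstructed: Sequence[float]) -> float:
    """Mean squared error between flattened input and reconstruction (assumed form)."""
    if len(original) != len(reconstructed):
        raise ValueError("original and reconstructed must have the same length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def total_loss(scales: List[Dict[str, Sequence[float]]], target: int,
               recon_weight: float = 0.0005) -> float:
    """Sum the per-scale classification and regeneration terms.

    Each entry of `scales` holds the capsule lengths, the flattened input,
    and the flattened reconstruction for one perception scale.
    recon_weight is a hypothetical balancing factor.
    """
    return sum(
        margin_loss(s["lengths"], target)
        + recon_weight * regeneration_loss(s["original"], s["reconstructed"])
        for s in scales
    )
```

With three entries in `scales`, one per perception scale, `total_loss` yields the six-term sum the abstract mentions. The attention mechanism itself (the RPN cropping and zooming) is not modeled here.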
Pages: 200-219
Page count: 20