Learning Scale-Consistent Attention Part Network for Fine-Grained Image Recognition

Cited by: 23
Authors
Liu, Huabin [1]
Li, Jianguo [2]
Li, Dian [3]
See, John [4]
Lin, Weiyao [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China
[2] Ant Financial Serv Grp, Beijing 101100, Peoples R China
[3] Tencent Technol Beijing Co Ltd, Beijing 100080, Peoples R China
[4] Heriot Watt Univ, Sch Math & Comp Sci, Putrajaya 62200, Malaysia
Funding
National Natural Science Foundation of China;
Keywords
Image recognition; Task analysis; Logic gates; Location awareness; Visualization; Training; Object detection; Fine-grained image recognition; scale-consistent; attention part;
DOI
10.1109/TMM.2021.3090274
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Discriminative region localization and feature learning are crucial for fine-grained visual recognition. Existing approaches address these tasks with attention mechanisms or part-based methods, but neglect the consistency between attention and local parts as well as the rich relational information among parts. This paper proposes a Scale-consistent Attention Part Network (SCAPNet) to address these shortcomings. It integrates three novel modules: a grid gate attention unit (gGAU), scale-consistent attention part selection (SCAPS), and part relation modeling (PRM). The gGAU module represents grid regions at a given fine scale using middle-layer CNN features and produces hard attention maps with a lightweight Gumbel-Max-based gate. The SCAPS module uses attention to guide part selection across multiple scales and keeps the selection scale-consistent. The PRM module uses self-attention to model relationships among parts based on their appearance and relative geo-positions. SCAPNet can be trained end-to-end and demonstrates state-of-the-art accuracy on several publicly available fine-grained recognition datasets (CUB-200-2011, FGVC-Aircraft, Veg200, and Fru92).
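Illustrative note: the abstract does not spell out how the gGAU gate is built, so the following is only a minimal sketch of the generic Gumbel-Max trick, not the authors' implementation. It assumes one "on" logit per grid cell and a fixed "off" logit of 0; the function names (`sample_gumbel`, `gumbel_max_gate`, `hard_attention_map`) and the example logits are hypothetical. Adding independent Gumbel(0, 1) noise to each logit and taking the argmax yields a hard 0/1 decision whose probability of being 1 equals the sigmoid of the logit difference.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample via inverse transform."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_max_gate(logit_on, logit_off, rng):
    """Return 1 if the 'on' option wins the noise-perturbed argmax, else 0."""
    on_score = logit_on + sample_gumbel(rng)
    off_score = logit_off + sample_gumbel(rng)
    return 1 if on_score > off_score else 0


def hard_attention_map(on_logits, rng, logit_off=0.0):
    """Apply the gate independently to every cell of a 2-D grid of logits."""
    return [[gumbel_max_gate(l, logit_off, rng) for l in row]
            for row in on_logits]


rng = random.Random(0)
grid_logits = [[4.0, -4.0], [0.0, 2.0]]  # hypothetical per-cell scores
mask = hard_attention_map(grid_logits, rng)  # 2x2 grid of 0/1 values
```

The hard argmax is not differentiable, so end-to-end training of such a gate generally requires a relaxation or a gradient estimator; which one SCAPNet uses is not stated in this record.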
Pages: 2902-2913
Number of pages: 12