Learning Scale-Consistent Attention Part Network for Fine-Grained Image Recognition

Cited by: 23

Authors:

Liu, Huabin [1]

Li, Jianguo [2]

Li, Dian [3]

See, John [4]

Lin, Weiyao [1]

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China

[2] Ant Financial Serv Grp, Beijing 101100, Peoples R China

[3] Tencent Technol Beijing Co Ltd, Beijing 100080, Peoples R China

[4] Heriot Watt Univ, Sch Math & Comp Sci, Putrajaya 62200, Malaysia

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2022 / Vol. 24

Funding:

National Natural Science Foundation of China;

Keywords:

Image recognition; Task analysis; Logic gates; Location awareness; Visualization; Training; Object detection; Fine-grained image recognition; scale-consistent; attention part;

DOI:

10.1109/TMM.2021.3090274

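Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract: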

Discriminative region localization and feature learning are crucial for fine-grained visual recognition. Existing approaches address this with attention mechanisms or part-based methods while neglecting the consistency between attention and local parts, as well as the rich relation information among parts. This paper proposes a Scale-consistent Attention Part Network (SCAPNet) to address that issue, which seamlessly integrates three novel modules: grid gate attention unit (gGAU), scale-consistent attention part selection (SCAPS), and part relation modeling (PRM). The gGAU module represents the grid region at a certain fine scale with middle-layer CNN features and produces hard attention maps with a lightweight Gumbel-Max based gate. The SCAPS module utilizes attention to guide part selection across multiple scales and keeps the selection scale-consistent. The PRM module utilizes the self-attention mechanism to build the relationship among parts based on their appearance and relative geo-positions. SCAPNet can be learned in an end-to-end way and demonstrates state-of-the-art accuracy on several publicly available fine-grained recognition datasets (CUB-200-2011, FGVC-Aircraft, Veg200, and Fru92).
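The abstract mentions a Gumbel-Max based gate that produces hard attention maps over grid regions. The record gives no implementation details, so the following is only a minimal sketch of the general Gumbel-Max trick applied to a binary keep/drop decision per grid cell. The function names (`gumbel_noise`, `hard_gate`, `hard_attention_map`), the two-logit layout, and the example logits are all hypothetical and are not taken from the paper. The sketch covers forward sampling only and says nothing about how the authors train the gate.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one standard Gumbel(0, 1) sample via the inverse transform."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def hard_gate(logit_on, logit_off, rng):
    """Gumbel-Max sample of a binary gate: 1 keeps the grid cell, 0 drops it."""
    on = logit_on + gumbel_noise(rng)
    off = logit_off + gumbel_noise(rng)
    return 1 if on > off else 0


def hard_attention_map(logits, rng):
    """Gate every cell of a 2-D grid of (logit_on, logit_off) pairs."""
    return [[hard_gate(on, off, rng) for (on, off) in row] for row in logits]


# Hypothetical 2x2 grid of gate logits (illustrative values only).
rng = random.Random(0)
grid_logits = [
    [(4.0, -4.0), (-4.0, 4.0)],
    [(0.0, 0.0), (2.0, -2.0)],
]
attention = hard_attention_map(grid_logits, rng)
```

Taking the argmax over logits perturbed by independent Gumbel noise is equivalent to sampling from the softmax distribution over those logits, so a cell with logits `(a, b)` is kept with probability `sigmoid(a - b)`. The output is a map of exact 0/1 values, which is what distinguishes hard attention from soft attention weights.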

Pages: 2902-2913

Page count: 12

