Hierarchical Attention Network for Open-Set Fine-Grained Image Recognition

被引:2
|
作者
Sun, Jiayin [1 ,2 ,3 ]
Wang, Hong [4 ]
Dong, Qiulei [1 ,2 ,3 ]
机构
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[4] Univ Chinese Acad Sci, Coll Life Sci, Beijing 100049, Peoples R China
关键词
Transformers; Feature extraction; Task analysis; Image recognition; Training; Visualization; Computer vision; Open-set fine-grained image recognition; hierarchical attention; long-short term memory; TEMPORAL ATTENTION; DIFFICULTY;
D O I
10.1109/TCSVT.2023.3325001
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Triggered by the success of transformers in various visual tasks, the spatial self-attention mechanism has recently attracted more and more attention in the computer vision community. However, we empirically found that a typical vision transformer with the spatial self-attention mechanism could not learn accurate attention maps for distinguishing different categories of fine-grained images. To address this problem, motivated by the temporal attention mechanism in brains, we propose a hierarchical attention network for learning fine-grained feature representations, called HAN, where the features learnt by implementing a sequence of spatial self-attention operations corresponding to multiple moments are aggregated progressively. The proposed HAN consists of four modules: a self-attention backbone module for learning a sequence of features with self-attention operations, a spatial feature self-organizing module for facilitating the model training, a hierarchical aggregation module for aggregating the re-organized features via a Long Short-Term Memory network, and a context-aware module that is implemented as the forget block of the hierarchical aggregation module for preserving/forgetting the long-term memory by utilizing contextual information. Then, we propose a HAN-based method for open-set fine-grained recognition by integrating the proposed HAN network with a linear classifier, called HAN-OSFGR. Extensive experimental results on 3 fine-grained datasets and 2 coarse-grained datasets demonstrate that the proposed HAN-OSFGR outperforms 9 state-of-the-art open-set recognition methods significantly in most cases.
引用
收藏
页码:3891 / 3904
页数:14
相关论文
共 50 条
  • [41] Weakly Supervised Fine-grained Recognition in a Segmentation-attention Network
    Yu, Nannan
    Zhang, Wenfeng
    Cai, Huanhuan
    ICMLC 2020: 2020 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2018, : 324 - 329
  • [42] A hierarchical sampling based triplet network for fine-grained image classification
    He, Guiqing
    Li, Feng
    Wang, Qiyao
    Bai, Zongwen
    Xu, Yuelei
    PATTERN RECOGNITION, 2021, 115
  • [43] Fine-Grained Image Classification Based on Cross-Attention Network
    Zheng, Zhiwen
    Zhou, Juxiang
    Gan, Jianhou
    Luo, Sen
    Gao, Wei
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2022, 18 (01)
  • [44] Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition
    Fu, Jianlong
    Zheng, Heliang
    Mei, Tao
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4476 - 4484
  • [45] Fine-grained attention for image caption generation
    Yan-Shuo Chang
    Multimedia Tools and Applications, 2018, 77 : 2959 - 2971
  • [46] Fine-grained attention for image caption generation
    Chang, Yan-Shuo
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (03) : 2959 - 2971
  • [47] Fine-grained Recognition of Chinese Food Image Based on DenseNet with Attention Mechanism
    Hao, Ran
    Gao, Weidong
    Mi, Jihang
    Zhao, Zhenwei
    TWELFTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2020), 2021, 11720
  • [48] Fine-Grained Crowdsourcing for Fine-Grained Recognition
    Jia Deng
    Krause, Jonathan
    Li Fei-Fei
    2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 580 - 587
  • [49] Dynamic Position-aware Network for Fine-grained Image Recognition
    Wang, Shijie
    Li, Haojie
    Wang, Zhihui
    Ouyang, Wanli
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2791 - 2799
  • [50] Fine-Grained Radio Frequency Fingerprint Recognition Network Based on Attention Mechanism
    Zhang, Yulan
    Hu, Jun
    Jiang, Rundong
    Lin, Zengrong
    Chen, Zengping
    ENTROPY, 2024, 26 (01)