Hierarchical Attention Network for Open-Set Fine-Grained Image Recognition

Cited by: 2

Authors:

Sun, Jiayin [1,2,3]

Wang, Hong [4]

Dong, Qiulei [1,2,3]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China

[2] Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China

[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China

[4] Univ Chinese Acad Sci, Coll Life Sci, Beijing 100049, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 05

Keywords:

Transformers; Feature extraction; Task analysis; Image recognition; Training; Visualization; Computer vision; Open-set fine-grained image recognition; hierarchical attention; long-short term memory; TEMPORAL ATTENTION; DIFFICULTY;

DOI:

10.1109/TCSVT.2023.3325001

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Triggered by the success of transformers in various visual tasks, the spatial self-attention mechanism has recently attracted more and more attention in the computer vision community. However, we empirically found that a typical vision transformer with the spatial self-attention mechanism could not learn accurate attention maps for distinguishing different categories of fine-grained images. To address this problem, motivated by the temporal attention mechanism in brains, we propose a hierarchical attention network for learning fine-grained feature representations, called HAN, where the features learnt by implementing a sequence of spatial self-attention operations corresponding to multiple moments are aggregated progressively. The proposed HAN consists of four modules: a self-attention backbone module for learning a sequence of features with self-attention operations, a spatial feature self-organizing module for facilitating the model training, a hierarchical aggregation module for aggregating the re-organized features via a Long Short-Term Memory network, and a context-aware module that is implemented as the forget block of the hierarchical aggregation module for preserving/forgetting the long-term memory by utilizing contextual information. Then, we propose a HAN-based method for open-set fine-grained recognition by integrating the proposed HAN network with a linear classifier, called HAN-OSFGR. Extensive experimental results on 3 fine-grained datasets and 2 coarse-grained datasets demonstrate that the proposed HAN-OSFGR outperforms 9 state-of-the-art open-set recognition methods significantly in most cases.

Pages: 3891-3904

Page count: 14

50 entries in total

[31] Learning to locate for fine-grained image recognition
Chen, Jiamin
Hu, Jianguo
Li, Shiren
COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 206
[32] CAMV: Class Activation Mapping Value Towards Open Set Fine-Grained Recognition
Dai, Wei
Diao, Wenhui
Sun, Xian
Zhang, Yue
Zhao, Liangjin
Li, Jun
Fu, Kun
IEEE ACCESS, 2021, 9 : 8167 - 8177
[33] Towards Fine-Grained Unknown Class Detection Against the Open-Set Attack Spectrum With Variable Legitimate Traffic
Zhao, Ziming
Li, Zhaoxuan
Xie, Xiaofei
Yu, Jiongchi
Zhang, Fan
Zhang, Rui
Chen, Binbin
Luo, Xiangyang
Hu, Ming
Ma, Wenrui
IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (05) : 3945 - 3960
[34] Context-Aware Visual Policy Network for Fine-Grained Image Captioning
Zha, Zheng-Jun
Liu, Daqing
Zhang, Hanwang
Zhang, Yongdong
Wu, Feng
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (02) : 710 - 722
[35] Exploring Rich Semantics for Open-Set Action Recognition
Hu, Yufan
Gao, Junyu
Dong, Jianfeng
Fan, Bin
Liu, Hongmin
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 5410 - 5421
[36] Learning Structured Relation Embeddings for Fine-Grained Fashion Attribute Recognition
Zhu, Shumin
Zou, Xingxing
Qian, Jianjun
Wong, Wai Keung
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1652 - 1664
[37] Part-Guided Relational Transformers for Fine-Grained Visual Recognition
Zhao, Yifan
Li, Jia
Chen, Xiaowu
Tian, Yonghong
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 9470 - 9481
[38] Deep LSAC for Fine-Grained Recognition
Lin, Di
Wang, Yi
Liang, Lingyu
Li, Ping
Chen, C. L. Philip
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (01) : 200 - 214
[39] TOAN: Target-Oriented Alignment Network for Fine-Grained Image Categorization With Few Labeled Samples
Huang, Huaxi
Zhang, Junjie
Yu, Litao
Zhang, Jian
Wu, Qiang
Xu, Chang
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (02) : 853 - 866
[40] Hierarchical Self-Distilled Feature Learning for Fine-Grained Visual Categorization
Hu, Yutao
Jiang, Xiaolong
Liu, Xuhui
Luo, Xiaoyan
Hu, Yao
Cao, Xianbin
Zhang, Baochang
Zhang, Jun
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, : 1 - 14
