ZoomViT: an observation behavior-based fine-grained recognition scheme

Authors
Ma Z. [1]
Yang Y. [1]
Wang H. [2]
Huang L. [1]
Wei Z. [1]
Affiliations
[1] Faculty of Information Science and Engineering, Ocean University of China, Songling Road, Shandong, Qingdao
[2] College of Computer and Cyber Security, Fujian Normal University, Xuefu South Road, Fuzhou
Funding
National Natural Science Foundation of China
Keywords
Discriminative foreground; Fine-grained image recognition; Image classification; Local region feature; Observation behavior; Visual attention
DOI
10.1007/s00521-024-09961-y
Abstract
Fine-grained image recognition aims to distinguish images with subtle differences and identify the sub-categories to which they belong. Recently, the vision transformer (ViT) has achieved promising results in many computer vision tasks. In this paper, we introduce human observation behavior into ViT and propose a novel transformer-based network named ZoomViT. We divide fine-grained recognition into two steps: "look closer" and "contrast." First, looking closer means observing finer local regions and multi-scale features while avoiding the adverse effect of background on recognition. We design a zoom-in module that tracks the attention flow by integrating the attention weights and zooms in on the discriminative foreground regions. In addition, the hard, non-overlapping image splitting used by ViT may harm recognition. We therefore design a zoom-out module that combines overlapping cutting and downsampling to maintain both the integrity of local neighboring structures and the running efficiency of the model. Finally, we propose contrasting the features of known sub-categories to supervise the model in learning subtle differences among sub-categories. Because the consistency of features extracted from different batches increases over time, we propose a variable-length queue that stores features from different batches so that contrastive learning can be conducted efficiently and fully. We experimentally demonstrate the state-of-the-art performance of our model on four popular fine-grained benchmarks: CUB-200-2011, Stanford Dogs, NABirds, and iNat2017. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
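The abstract describes the zoom-in module only at a high level, so the following is a minimal sketch of one common way to "integrate attention weights" across layers (attention rollout) and then pick a foreground box; it is not the authors' implementation. The function names, the 0.5 residual mixing weight, and the above-the-mean threshold are all assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def attention_rollout(layers):
    """Accumulate head-averaged attention maps across layers.

    layers: list of (n x n) matrices whose rows sum to 1, first layer first.
    Hypothetical helper; the paper's exact integration rule is not given here.
    """
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Assumed residual handling: mix in the identity; rows still sum to 1.
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0)
                  for j in range(n)] for i in range(n)]
        rollout = matmul(mixed, rollout)
    return rollout


def foreground_box(patch_scores, grid_w):
    """Return (row_min, col_min, row_max, col_max) of patches scoring above the mean.

    patch_scores: one score per patch in row-major order (class token excluded).
    Falls back to the whole grid when no patch exceeds the mean.
    """
    grid_h = len(patch_scores) // grid_w
    mean = sum(patch_scores) / len(patch_scores)
    cells = [(i // grid_w, i % grid_w)
             for i, s in enumerate(patch_scores) if s > mean]
    if not cells:
        return 0, 0, grid_h - 1, grid_w - 1
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)
```

In use, row 0 of the rollout matrix (the class token, under the usual ViT layout) minus its first entry would give the per-patch scores passed to `foreground_box`, and the resulting box would be cropped from the image and upsampled before a second pass.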
Pages: 12775-12789
Page count: 14