Adopting Attention and Cross-Layer Features for Fine-Grained Representation

Cited by: 3
Authors
Sun, Fayou [1]
Ngo, Hea Choon [1 ]
Sek, Yong Wee [1 ]
Affiliations
[1] Univ Teknikal Malaysia Melaka, Fac Informat & Commun Technol, Ctr Adv Comp Technol, Durian Tunggal 76100, Malacca, Malaysia
Keywords
Feature extraction; Representation learning; Semantics; Transformers; Sun; Convolution; Task analysis; Associating cross-layer features; attention-based operations; self-attention; CLNET
DOI
10.1109/ACCESS.2022.3195907
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Fine-grained visual classification (FGVC) is a challenging task because it requires learning discriminative feature representations. Attention-based methods show great potential for FGVC, but they neglect that deeply mining inter-layer feature relations can further refine feature learning. Likewise, methods that associate cross-layer features achieve significant feature enhancement but lose the long-distance dependencies between elements. Most previous work treats these two approaches as independent of each other, overlooking that they are mutually correlated and can jointly reinforce feature learning. We therefore combine the respective advantages of both to promote fine-grained feature representations. In this paper, we propose a novel network, CLNET, which effectively applies an attention mechanism and cross-layer features to obtain feature representations. Specifically, CLNET 1) adopts self-attention to capture long-range dependencies for each element, 2) associates cross-layer features to reinforce feature learning, and 3) integrates attention-based operations between output and input to cover more feature regions. Experiments verify that CLNET yields new state-of-the-art performance on three widely used fine-grained benchmarks: CUB-200-2011, Stanford Cars, and FGVC-Aircraft. Our code is available at https://github.com/dlearing/CLNET.git.
Pages: 82376-82383
Number of pages: 8
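Note: the abstract combines two ingredients, self-attention for long-range dependencies and the association of features taken from different backbone layers. The authors' implementation is at the GitHub URL above; the sketch below is only a minimal illustrative PyTorch example of how these two ingredients could be wired together. The module names, layer shapes, and the simple add-and-classify fusion rule are assumptions for illustration, not taken from the paper.

# Minimal sketch (not the authors' released code): self-attention over spatial
# positions plus fusion of a shallow and a deep feature map. All shapes and the
# fusion rule are illustrative assumptions.
import torch
import torch.nn as nn


class SpatialSelfAttention(nn.Module):
    """Single-head self-attention over the H*W positions of a feature map."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)      # (B, HW, C/8)
        k = self.key(x).flatten(2)                         # (B, C/8, HW)
        v = self.value(x).flatten(2)                       # (B, C, HW)
        attn = torch.softmax(q @ k, dim=-1)                # (B, HW, HW) pairwise weights
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)  # re-weighted features
        return self.gamma * out + x                        # residual connection


class CrossLayerFusion(nn.Module):
    """Refine a shallow and a deep feature map with attention, then fuse them."""

    def __init__(self, shallow_ch: int, deep_ch: int, num_classes: int):
        super().__init__()
        self.attn_shallow = SpatialSelfAttention(shallow_ch)
        self.attn_deep = SpatialSelfAttention(deep_ch)
        self.project = nn.Conv2d(shallow_ch, deep_ch, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(deep_ch, num_classes)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        shallow = self.attn_shallow(shallow)
        deep = self.attn_deep(deep)
        # Match the shallow map to the deep map's channels and resolution,
        # then combine the two layers element-wise before classification.
        shallow = self.project(shallow)
        shallow = nn.functional.adaptive_avg_pool2d(shallow, deep.shape[-2:])
        fused = shallow + deep
        return self.classifier(self.pool(fused).flatten(1))


if __name__ == "__main__":
    # Toy shapes standing in for two stages of a CNN backbone (e.g. a ResNet).
    model = CrossLayerFusion(shallow_ch=512, deep_ch=2048, num_classes=200)
    shallow = torch.randn(2, 512, 28, 28)
    deep = torch.randn(2, 2048, 7, 7)
    print(model(shallow, deep).shape)  # torch.Size([2, 200])

In this toy setup the two inputs could come from, for example, an intermediate and the final stage of a ResNet backbone; which stages CLNET actually uses and how it fuses them is defined in the released code.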