Adopting Attention and Cross-Layer Features for Fine-Grained Representation

Cited by: 3
Authors
Sun, Fayou [1]
Ngo, Hea Choon [1 ]
Sek, Yong Wee [1 ]
Affiliations
[1] Univ Teknikal Malaysia Melaka, Fac Informat & Commun Technol, Ctr Adv Comp Technol, Durian Tunggal 76100, Malacca, Malaysia
Keywords
Feature extraction; Representation learning; Semantics; Transformers; Sun; Convolution; Task analysis; Associating cross-layer features; attention-based operations; self-attention; CLNET
DOI
10.1109/ACCESS.2022.3195907
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Fine-grained visual classification (FGVC) is a challenging task because it requires learning discriminative feature representations. Attention-based methods show great potential for FGVC, but they neglect that deeply mining inter-layer feature relations can further refine feature learning. Likewise, methods that associate cross-layer features achieve significant feature enhancement but lose the long-distance dependencies between elements. Most previous work treats these two approaches as independent of each other, overlooking that they are mutually correlated and can jointly reinforce feature learning. We therefore combine the respective advantages of both to promote fine-grained feature representations. In this paper, we propose a novel network, CLNET, which effectively applies an attention mechanism and cross-layer features to obtain feature representations. Specifically, CLNET 1) adopts self-attention to capture long-range dependencies for each element, 2) associates cross-layer features to reinforce feature learning, and 3) integrates attention-based operations between output and input to cover more feature regions. Experiments verify that CLNET yields new state-of-the-art performance on three widely used fine-grained benchmarks: CUB-200-2011, Stanford Cars, and FGVC-Aircraft. Our code is available at https://github.com/dlearing/CLNET.git.
Pages: 82376-82383
Number of pages: 8
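Note: the abstract combines two ingredients, self-attention for long-range dependencies and the association of features taken from different backbone layers. The authors' implementation is at the GitHub URL above; the sketch below is only a minimal illustrative PyTorch example of how these two ingredients could be wired together. The module names, layer shapes, and the simple add-and-classify fusion rule are assumptions for illustration, not taken from the paper.

# Minimal sketch (not the authors' released code): self-attention over spatial
# positions plus fusion of a shallow and a deep feature map. All shapes and the
# fusion rule are illustrative assumptions.
import torch
import torch.nn as nn


class SpatialSelfAttention(nn.Module):
    """Single-head self-attention over the H*W positions of a feature map."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)      # (B, HW, C/8)
        k = self.key(x).flatten(2)                         # (B, C/8, HW)
        v = self.value(x).flatten(2)                       # (B, C, HW)
        attn = torch.softmax(q @ k, dim=-1)                # (B, HW, HW) pairwise weights
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)  # re-weighted features
        return self.gamma * out + x                        # residual connection


class CrossLayerFusion(nn.Module):
    """Refine a shallow and a deep feature map with attention, then fuse them."""

    def __init__(self, shallow_ch: int, deep_ch: int, num_classes: int):
        super().__init__()
        self.attn_shallow = SpatialSelfAttention(shallow_ch)
        self.attn_deep = SpatialSelfAttention(deep_ch)
        self.project = nn.Conv2d(shallow_ch, deep_ch, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(deep_ch, num_classes)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        shallow = self.attn_shallow(shallow)
        deep = self.attn_deep(deep)
        # Match the shallow map to the deep map's channels and resolution,
        # then combine the two layers element-wise before classification.
        shallow = self.project(shallow)
        shallow = nn.functional.adaptive_avg_pool2d(shallow, deep.shape[-2:])
        fused = shallow + deep
        return self.classifier(self.pool(fused).flatten(1))


if __name__ == "__main__":
    # Toy shapes standing in for two stages of a CNN backbone (e.g. a ResNet).
    model = CrossLayerFusion(shallow_ch=512, deep_ch=2048, num_classes=200)
    shallow = torch.randn(2, 512, 28, 28)
    deep = torch.randn(2, 2048, 7, 7)
    print(model(shallow, deep).shape)  # torch.Size([2, 200])

In this toy setup the two inputs could come from, for example, an intermediate and the final stage of a ResNet backbone; which stages CLNET actually uses and how it fuses them is defined in the released code.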