Multi-Scale Feature Transformer Based Fine-Grained Image Classification Method

Cited by: 0
Authors
Zhang T. [1 ]
Cai C. [1 ]
Luo X. [2 ]
Zhu Y. [3 ]
Affiliations
[1] School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing
[2] China Mobile (Jiangxi) Virtual Reality Technology Company Limited, Nanchang
[3] China Branch of BRICS Institute of Future Networks, Shenzhen
Source
Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications | 2023, Vol. 46, No. 4
Keywords
fine-grained feature; fine-grained image classification; long-tail distribution; Transformer
DOI
10.13190/j.jbupt.2022-164
Abstract
To address the long-tail distribution problem in fine-grained image classification, a fine-grained image classification method based on a multi-scale feature Transformer is proposed; it protects both shallow and deep features while mitigating the long-tail distribution. First, a hybrid data sampling method is designed to obtain ternary data for optimizing representation learning, the long-tail distribution, and fine-grained features. Then, a Transformer multi-scale feature optimization method is designed, in which a shallow-feature contrastive learning method and a deep-feature balanced learning method optimize the feature learning process, alleviating category confusion, improving fine-grained feature extraction, and increasing attention to tail categories while protecting feature learning for head categories. Simulation results show that the proposed method effectively mitigates the impact of the long-tail distribution in fine-grained image classification tasks, optimizes the feature distribution, and improves classification accuracy. © 2023 Beijing University of Posts and Telecommunications. All rights reserved.
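The abstract does not specify the exact form of the deep-feature balanced learning. One common balanced-learning technique for long-tail classification is a prior-adjusted (balanced) softmax cross-entropy, sketched below purely as an illustrative assumption rather than the paper's actual loss:

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def balanced_softmax_loss(logits, labels, class_counts):
    """Cross-entropy with logits shifted by log class priors, so that
    rare (tail) classes are not dominated by frequent (head) classes."""
    priors = np.asarray(class_counts, dtype=float)
    priors = priors / priors.sum()
    log_probs = log_softmax(logits + np.log(priors))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With uniform class counts the prior shift is a per-row constant, so the loss reduces exactly to plain softmax cross-entropy; with skewed counts, tail-class logits are boosted relative to head-class logits.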
Pages: 70-75
Page count: 5
References
13 references in total
[1]  
HE J, CHEN J, LIU S, et al., TransFG: a transformer architecture for fine-grained recognition, Proceedings of the 36th AAAI Conference on Artificial Intelligence, pp. 1174-1182, (2022)
[2]  
LIU X D, WANG L L, HAN X G., Transformer with peak suppression and knowledge guidance for fine-grained image recognition, Neurocomputing, 492, pp. 137-149, (2022)
[3]  
HU Y Q, JIN X, ZHANG Y, et al., RAMS-Trans: recurrent attention multi-scale transformer for fine-grained image recognition, Proceedings of the 29th ACM International Conference on Multimedia, pp. 4239-4248, (2021)
[4]  
LIU M, ZHANG C, BAI H, et al., Cross-part learning for fine-grained image classification, IEEE Transactions on Image Processing, 31, pp. 748-758, (2022)
[5]  
LIAO K Y, HUANG G, ZHENG Y L, et al., Fine-grained classification of complementary attention diversity feature fusion network, Journal of Image and Graphics, 10, pp. 1-12, (2022)
[6]  
LIU Z, LIN Y T, CAO Y, et al., Swin Transformer: hierarchical vision transformer using shifted windows, Proceedings of the IEEE International Conference on Computer Vision, pp. 9992-10002, (2021)
[7]  
MULLER R, KORNBLITH S, HINTON G E., When does label smoothing help?, Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 1-16, (2019)
[8]  
WU X P, ZHAN C, LAI Y K, et al., IP102: a large-scale benchmark dataset for insect pest recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8787-8796, (2019)
[9]  
NILSBACK M E, ZISSERMAN A., Automated flower classification over a large number of classes, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pp. 722-729, (2008)
[10]  
WAH C, BRANSON S, WELINDER P, et al., The Caltech-UCSD Birds-200-2011 dataset, (2011)