Research on Multi-Scale CNN and Transformer-Based Multi-Level Multi-Classification Method for Images

Cited by: 1
Authors
Gou, Quandeng [1]
Ren, Yuheng [2, 3]
Affiliations
[1] Neijiang Normal Univ, Informatizat Construct & Serv Ctr, Neijiang 641000, Peoples R China
[2] Xiamen Kunlu IoT Informat Technol Co Ltd, Xiamen 361021, Fujian, Peoples R China
[3] European Union Univ, Sch Business Econ, CH-1820 Montreux, Switzerland
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Feature extraction; Task analysis; Convolution; Image classification; Convolutional neural networks; Vectors; Transformer; hierarchical characteristics of the model; multi-scale convolution; multi-level and multi-classification of images
DOI
10.1109/ACCESS.2024.3433374
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid growth of digital creative work, the volume of image data it generates has exploded. Managing image data at this scale calls for multi-level, multi-class classification of images. However, existing deep learning models for hierarchical image classification are all based on convolutional neural networks, which are limited in capturing the underlying global features. By contrast, the Transformer, a newer neural network architecture, captures global context through its attention mechanism and performs strongly on a range of visual recognition tasks. Existing Transformer-based work, however, does not use the hierarchical structure information in the model, which makes it hard to apply to multi-level, multi-class image classification. This paper therefore proposes a new multi-level, multi-class image classification model that uses a multi-scale CNN to capture feature information at different scales and combines it with the Transformer's ability to extract global features. The model also makes full use of the hierarchical structure information in the Transformer to better capture the complex relationships in images. Extensive experiments on three datasets, CIFAR-10, CIFAR-100, and CUB-200-2011, compare the model with existing multi-level, multi-class image classification models. The results show that the proposed model achieves higher classification accuracy and better robustness.
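To make the multi-level, multi-class setting concrete: each image carries one label per hierarchy level, and a prediction can be scored at each level separately. The following is a minimal sketch of that labeling and scoring idea only, not the authors' model. The toy two-level hierarchy and the names `FINE_TO_COARSE`, `to_levels`, `per_level_accuracy`, and `is_consistent` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy hierarchy (fine label -> coarse label); illustrative only.
FINE_TO_COARSE: Dict[str, str] = {
    "maple": "trees",
    "oak": "trees",
    "rose": "flowers",
    "tulip": "flowers",
}

Label = Tuple[str, str]  # (coarse, fine)


def to_levels(fine_label: str) -> Label:
    """Expand a fine label into one label per hierarchy level."""
    return FINE_TO_COARSE[fine_label], fine_label


def per_level_accuracy(preds: Sequence[Label], targets: Sequence[Label]) -> List[float]:
    """Compute accuracy independently at each level of the hierarchy."""
    n_levels = len(targets[0])
    correct = [0] * n_levels
    for pred, target in zip(preds, targets):
        for level in range(n_levels):
            correct[level] += int(pred[level] == target[level])
    return [c / len(targets) for c in correct]


def is_consistent(pred: Label) -> bool:
    """Check that the predicted fine class belongs to the predicted coarse class."""
    coarse, fine = pred
    return FINE_TO_COARSE.get(fine) == coarse


targets = [to_levels(f) for f in ["maple", "oak", "rose", "tulip"]]
# Made-up predictions: one fine-level error, one coarse-level error.
preds = [("trees", "maple"), ("trees", "maple"), ("flowers", "rose"), ("trees", "tulip")]

acc = per_level_accuracy(preds, targets)
print("coarse accuracy:", acc[0], "fine accuracy:", acc[1])
print("inconsistent predictions:", [p for p in preds if not is_consistent(p)])
```

Scoring each level on its own shows where errors occur in the hierarchy, and the consistency check flags predictions whose coarse and fine labels disagree.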
Pages: 103049-103059
Page count: 11