Research on Multi-Scale CNN and Transformer-Based Multi-Level Multi-Classification Method for Images

Cited by: 1
Authors
Gou, Quandeng [1]
Ren, Yuheng [2, 3]
Affiliations
[1] Neijiang Normal Univ, Informatizat Construct & Serv Ctr, Neijiang 641000, Peoples R China
[2] Xiamen Kunlu IoT Informat Technol Co Ltd, Xiamen 361021, Fujian, Peoples R China
[3] European Union Univ, Sch Business Econ, CH-1820 Montreux, Switzerland
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Feature extraction; Task analysis; Convolution; Image classification; Convolutional neural networks; Vectors; Transformer; hierarchical characteristics of the model; multi-scale convolution; multi-level and multi-classification of images;
DOI
10.1109/ACCESS.2024.3433374
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The rapid growth of digital creative work has produced an explosion of image data. Managing image collections at this scale effectively requires multi-level, multi-class classification of images. However, existing deep-learning models for hierarchical image classification are all based on convolutional neural networks, which are limited in their ability to capture global features. The Transformer, by contrast, captures global context information through its attention mechanism and therefore performs well across a range of visual recognition tasks. Existing Transformer-based work, however, does not use the hierarchical structure information in the model, which makes it difficult to apply to multi-level, multi-class image classification. This paper therefore proposes a new multi-level, multi-class image classification model that uses a multi-scale CNN to capture feature information at different scales and combines it with the Transformer's ability to extract global features. The model also makes full use of the hierarchical structure information in the Transformer to better capture the complex relationships within images. Extensive experiments on three datasets, CIFAR-10, CIFAR-100, and CUB-200-2011, compare the model with existing multi-level, multi-class image classification models. The results show that the proposed model achieves higher classification accuracy and better robustness.
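The pipeline the abstract describes (multi-scale convolution, attention for global context, one prediction per hierarchy level) can be illustrated with a toy sketch. This is not the authors' model: the 1-D input, the averaging kernels of sizes 1, 3, and 5, the identity query/key/value projections, the mean pooling, and the random head weights are all assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random

def conv1d_same(signal, kernel):
    """1-D convolution with zero padding so the output keeps the input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]

def multi_scale_tokens(signal, kernel_sizes=(1, 3, 5)):
    """One token per position; each token stacks the responses of several kernel sizes."""
    branches = [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]
    return [list(features) for features in zip(*branches)]

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections (assumption)."""
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        weights = softmax([sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                           for key in tokens])
        # Every output token is a weighted mix of all tokens: global context.
        mixed.append([sum(w * value[d] for w, value in zip(weights, tokens))
                      for d in range(dim)])
    return mixed

def linear_head(features, weights):
    """Class probabilities from one linear layer followed by softmax."""
    return softmax([sum(w * f for w, f in zip(row, features)) for row in weights])

def classify_multi_level(signal, coarse_weights, fine_weights):
    tokens = self_attention(multi_scale_tokens(signal))
    pooled = [sum(column) / len(tokens) for column in zip(*tokens)]  # mean over tokens
    # One head per hierarchy level, both reading the same pooled features.
    return linear_head(pooled, coarse_weights), linear_head(pooled, fine_weights)

rng = random.Random(0)
signal = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0]  # hypothetical toy input
coarse_weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # 2 coarse classes
fine_weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]    # 4 fine classes
coarse_probs, fine_probs = classify_multi_level(signal, coarse_weights, fine_weights)
print(len(coarse_probs), len(fine_probs))
```

The two heads share one feature vector, which is one simple way to tie a coarse level and a fine level together; the paper may couple the levels differently.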
Pages: 103049 - 103059
Page count: 11
Related papers
50 in total
  • [21] Crop classification based on G-CNN using multi-scale remote sensing images
    Meng, Mengmeng
    Zhang, Kaixin
    Huang, Yabo
    Li, Ning
    Guo, Zhengwei
    Zhou, Zhimin
    REMOTE SENSING LETTERS, 2024, 15 (09) : 941 - 950
  • [22] Transformer-based multi-level attention integration network for video saliency prediction
    Rui Tan
    Minghui Sun
    Yanhua Liang
    Multimedia Tools and Applications, 2025, 84 (13) : 11833 - 11854
  • [23] Multi-Scale Feature Transformer Based Fine-Grained Image Classification Method
    Zhang T.
    Cai C.
    Luo X.
    Zhu Y.
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (04): : 70 - 75
  • [24] MTT: Multi-Scale Temporal Transformer for Skeleton-Based Action Recognition
    Kong, Jun
    Bian, Yuhang
    Jiang, Min
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 528 - 532
  • [25] AMSFormer: A transformer with adaptive multi-scale partitioning and multi-level spectral filtering for time-series forecasting
    Liu, Honghao
    Diao, Yining
    Sun, Ke
    Wan, Zhaolin
    Li, Zhiyang
    NEUROCOMPUTING, 2025, 637
  • [26] Transformer-Based Multi-Scale Feature Integration Network for Video Saliency Prediction
    Zhou, Xiaofei
    Wu, Songhe
    Shi, Ran
    Zheng, Bolun
    Wang, Shuai
    Yin, Haibing
    Zhang, Jiyong
    Yan, Chenggang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (12) : 7696 - 7707
  • [27] Pedestrian Re-Identification Based on CNN and TransFormer Multi-scale Learning
    Chen, Ying
    Kuang, Cheng
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2023, 45 (06) : 2256 - 2263
  • [28] TMA-Net: A Transformer-Based Multi-Scale Attention Network for Surgical Instrument Segmentation
    Yang, Lei
    Wang, Hongyong
    Gu, Yuge
    Bian, Guibin
    Liu, Yanhong
    Yu, Hongnian
    IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2023, 5 (02): : 323 - 334
  • [29] Seismic Data Interpolation Based on Multi-Scale Transformer
    Guo, Yuanqi
    Fu, Lihua
    Li, Hongwei
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [30] Weld TOFD defect classification method based on multi-scale CNN and cascaded focused attention
    Tang, Donglin
    Zhang, Junhui
    Wang, Pingjie
    He, Yuanyuan
    JOURNAL OF MANUFACTURING PROCESSES, 2025, 138 : 157 - 168