Research on Multi-Scale CNN and Transformer-Based Multi-Level Multi-Classification Method for Images

Authors
Gou, Quandeng [1]
Ren, Yuheng [2, 3]
Affiliations
[1] Neijiang Normal Univ, Informatizat Construct & Serv Ctr, Neijiang 641000, Peoples R China
[2] Xiamen Kunlu IoT Informat Technol Co Ltd, Xiamen 361021, Fujian, Peoples R China
[3] European Union Univ, Sch Business Econ, CH-1820 Montreux, Switzerland
Source
IEEE ACCESS | 2024 / Volume 12
Keywords
Feature extraction; Task analysis; Convolution; Image classification; Convolutional neural networks; Vectors; Transformer; hierarchical characteristics of the model; multi-scale convolution; multi-level and multi-classification of images
DOI
10.1109/ACCESS.2024.3433374
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid growth of digital creative work, the volume of image data it generates has exploded. Managing image data at this scale requires multi-level, multi-class classification of images. However, existing deep-learning models for hierarchical image classification are all based on convolutional neural networks (CNNs), which are limited in their ability to capture global features. The Transformer, by contrast, captures global context through its attention mechanism and performs well across a range of visual recognition tasks. Existing Transformer-based work, however, does not exploit the hierarchical structure information within the model, which makes it difficult to apply to multi-level, multi-class image classification. This paper therefore proposes a new multi-level, multi-class image classification model that uses a multi-scale CNN to capture feature information at different scales and combines it with the Transformer's ability to extract global features. The model also makes full use of the hierarchical structure information in the Transformer to better capture the complex relationships in images. Extensive experiments on three datasets, CIFAR-10, CIFAR-100, and CUB-200-2011, compare the model against existing multi-level, multi-class image classification models. The results show that the proposed model achieves higher classification accuracy and better robustness.
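The general idea in the abstract (multi-scale features, global attention, one prediction per hierarchy level) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: average pooling stands in for learned multi-scale convolution branches, a single unparameterised-by-training attention layer stands in for the Transformer, and all names (`multi_scale_tokens`, `self_attention`, `classify_hierarchy`), the random weights, and the choice of 2 coarse and 10 fine classes are assumptions made only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_scale_tokens(image, scales=(1, 2, 4)):
    """Turn an H x W x C image into tokens pooled at several scales.

    Non-overlapping s x s average pooling is an assumed stand-in for
    the learned multi-scale convolution branches.
    """
    h, w, c = image.shape
    tokens = []
    for s in scales:
        cropped = image[: h - h % s, : w - w % s]
        pooled = cropped.reshape(h // s, s, w // s, s, c).mean(axis=(1, 3))
        tokens.append(pooled.reshape(-1, c))
    return np.concatenate(tokens, axis=0)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return weights @ v


def classify_hierarchy(image, params):
    """Return (coarse, fine) class probabilities from shared features."""
    tokens = multi_scale_tokens(image)
    attended = self_attention(tokens, params["w_q"], params["w_k"], params["w_v"])
    pooled = attended.mean(axis=0)  # global average over tokens
    coarse = softmax(pooled @ params["w_coarse"])
    fine = softmax(pooled @ params["w_fine"])
    return coarse, fine


# Hypothetical, untrained weights: 3 input channels, 8 hidden dims,
# 2 coarse classes and 10 fine classes (assumed sizes).
rng = np.random.default_rng(0)
d = 8
params = {
    "w_q": rng.normal(size=(3, d)),
    "w_k": rng.normal(size=(3, d)),
    "w_v": rng.normal(size=(3, d)),
    "w_coarse": rng.normal(size=(d, 2)),
    "w_fine": rng.normal(size=(d, 10)),
}
image = rng.random((32, 32, 3))  # a CIFAR-sized random image
coarse, fine = classify_hierarchy(image, params)
```

With untrained weights the outputs are meaningless as predictions; the sketch only shows how tokens from several scales can share one attention layer and feed a separate classification head per hierarchy level.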
Pages: 103049 - 103059
Number of pages: 11