Research on Multi-Scale CNN and Transformer-Based Multi-Level Multi-Classification Method for Images

Cited by: 1
Authors
Gou, Quandeng [1]
Ren, Yuheng [2, 3]
Affiliations
[1] Neijiang Normal Univ, Informatizat Construct & Serv Ctr, Neijiang 641000, Peoples R China
[2] Xiamen Kunlu IoT Informat Technol Co Ltd, Xiamen 361021, Fujian, Peoples R China
[3] European Union Univ, Sch Business Econ, CH-1820 Montreux, Switzerland
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Feature extraction; Task analysis; Convolution; Image classification; Convolutional neural networks; Vectors; Transformer; hierarchical characteristics of the model; multi-scale convolution; multi-level and multi-classification of images;
DOI
10.1109/ACCESS.2024.3433374
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the rapid growth of digital creative work, the volume of image data it generates has exploded. Managing such massive image collections effectively requires multi-level and multi-classification management of images. However, existing deep learning models for hierarchical image classification are all based on convolutional neural networks, which are limited in their ability to capture underlying global features. In contrast, the Transformer, a newer neural network architecture, captures global context information through its attention mechanism and therefore performs excellently on a variety of visual recognition tasks. Yet existing Transformer-based work does not exploit the hierarchical structure information in the model, which makes it difficult to apply to multi-level and multi-classification tasks for images. This paper therefore proposes a new multi-level and multi-classification model for images that uses a multi-scale CNN to capture feature information at different scales and combines it with the Transformer's ability to extract global features. The model also makes full use of the hierarchical structure information in the Transformer to better understand the complex relationships in images. We conducted extensive experiments on three datasets, CIFAR-10, CIFAR-100, and CUB-200-2011, and compared performance against existing multi-level and multi-classification models for images. The results show that our model achieves higher classification accuracy and better robustness.
Pages: 103049 - 103059
Number of pages: 11
Related papers
50 in total
  • [41] A Transformer-Based Decoder for Semantic Segmentation with Multi-level Context Mining
    Shi, Bowen
    Jiang, Dongsheng
    Zhang, Xiaopeng
    Li, Han
    Dai, Wenrui
    Zou, Junni
    Xiong, Hongkai
    Tian, Qi
    COMPUTER VISION - ECCV 2022, PT XXVIII, 2022, 13688 : 624 - 639
  • [42] Multi-scale compromise and multi-level correlation in complex systems
    Li, J
    Ge, W
    Zhang, J
    Kwauk, M
    CHEMICAL ENGINEERING RESEARCH & DESIGN, 2005, 83 (A6) : 574 - 582
  • [43] Multi-Instance Multi-Scale CNN for Medical Image Classification
    Li, Shaohua
    Liu, Yong
    Sui, Xiuchao
    Chen, Cheng
    Tjio, Gabriel
    Ting, Daniel Shu Wei
    Goh, Rick Siow Mong
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT IV, 2019, 11767 : 531 - 539
  • [44] Multi-level and Multi-scale Spatial and Spectral Fusion CNN for Hyperspectral Image Super-resolution
    Han, Xian-Hua
    Zheng, YinQiang
    Chen, Yen-Wei
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019 : 4330 - 4339
  • [45] TransMF: Transformer-Based Multi-Scale Fusion Model for Crack Detection
    Ju, Xiaochen
    Zhao, Xinxin
    Qian, Shengsheng
    MATHEMATICS, 2022, 10 (13)
  • [46] ScaleFormer: Transformer-based speech enhancement in the multi-scale time domain
    Wu, Tianci
    He, Shulin
    Zhang, Hui
    Zhang, XueLiang
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023 : 2448 - 2453
  • [47] An Effective Multi-classification Method for NHL Pathological Images
    Jiang, Huiyan
    Li, Zhongkuan
    Li, Siqi
    Zhou, Fucai
    2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2018 : 763 - 768
  • [48] Crop classification based on G-CNN using multi-scale remote sensing images
    Meng, Mengmeng
    Zhang, Kaixin
    Huang, Yabo
    Li, Ning
    Guo, Zhengwei
    Zhou, Zhimin
    REMOTE SENSING LETTERS, 2024, 15 (09) : 941 - 950
  • [49] Multi-Task Learning Model Based on Multi-Scale CNN and LSTM for Sentiment Classification
    Jin, Ning
    Wu, Jiaxian
    Ma, Xiang
    Yan, Ke
    Mo, Yuchang
    IEEE ACCESS, 2020, 8 : 77060 - 77072
  • [50] Road Recognition Based on Multi-scale Convolutional Network with Multi-level Feature Fusion
    Li, Ye
    Guo, Lili
    Xu, Lele
    Wang, Xianfeng
    Jin, Shan
    TENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2018), 2019, 11069