Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 7
Authors
Huo, Yingzi [1]
Jin, Kai [2]
Cai, Jiahong [1]
Xiong, Huixuan [1]
Pang, Jiacheng [1]
Affiliations
[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China
Source
2023 IEEE 9TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD, BIGDATASECURITY, IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC AND IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS | 2023
Keywords
CNN; image classification; token; vision transformer; Vision Reservoir
DOI
10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the Vision Transformer (ViT) model has been widely used in computer vision, especially for image classification. This paper surveys the application of ViT to image classification. It first introduces the image classification pipeline and the basic architecture of the ViT model. It then reviews three families of image classification methods (traditional methods, CNN-based methods, and ViT-based methods) and provides a comparative analysis of CNN and ViT. Finally, it outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.
Pages: 135-140
Page count: 6