Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 7

Authors:

Huo, Yingzi [1]

Jin, Kai [2]

Cai, Jiahong [1]

Xiong, Huixuan [1]

Pang, Jiacheng [1]

Affiliations:

[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China

[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China

Source:

2023 IEEE 9TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD, BIGDATASECURITY, IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC AND IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS | 2023

Keywords:

CNN; image classification; token; vision transformer; Vision Reservoir;

DOI:

10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, the Vision Transformer (ViT) model has been widely used in computer vision, especially for image classification. This paper surveys the application of ViT to image classification. It first introduces the image classification pipeline and the basic architecture of the ViT model. It then reviews three families of image classification methods (traditional, CNN-based, and ViT-based) and provides a comparative analysis of CNN and ViT. Finally, it outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.

Pages: 135-140

Number of pages: 6

50 items in total

[1] CLASSIFICATION OF INTRACRANIAL HEMORRHAGE BASED ON CT-SCAN IMAGE WITH VISION TRANSFORMER (VIT) METHOD
Faiz, Muhammad Nur
Badriyah, Tessy
Kusuma, Selvia Ferdiana
2024 INTERNATIONAL ELECTRONICS SYMPOSIUM, IES 2024, 2024: 454 - 459
[2] MIL-ViT: A multiple instance vision transformer for fundus image classification
Bi, Qi
Sun, Xu
Yu, Shuang
Ma, Kai
Bian, Cheng
Ning, Munan
He, Nanjun
Huang, Yawen
Li, Yuexiang
Liu, Hanruo
Zheng, Yefeng
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 97
[3] ViT-DualAtt: An efficient pornographic image classification method based on Vision Transformer with dual attention
Cai, Zengyu
Xu, Liusen
Zhang, Jianwei
Feng, Yuan
Zhu, Liang
Liu, Fangmei
ELECTRONIC RESEARCH ARCHIVE, 2024, 32 (12): 6698 - 6716
[4] The Application of Vision Transformer in Image Classification
He, Zhixuan
2022 THE 6TH INTERNATIONAL CONFERENCE ON VIRTUAL AND AUGMENTED REALITY SIMULATIONS, ICVARS 2022, 2022: 56 - 63
[5] SI-ViT: Shuffle instance-based Vision Transformer for pancreatic cancer ROSE image classification
Zhang, Tianyi
Feng, Youdan
Zhao, Yu
Lei, Yanli
Ying, Nan
Song, Fan
He, Yufang
Yan, Zhiling
Feng, Yunlu
Yang, Aiming
Zhang, Guanglei
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, 244
[6] A ViT Vision Transformer Model for Rose Leaf Disease Classification
Saini, Archana
Guleria, Kalpna
Sharma, Shagun
2024 2ND WORLD CONFERENCE ON COMMUNICATION & COMPUTING, WCONF 2024, 2024
[7] CWC-MP-MC Image-based breast tumor classification using an optimized Vision Transformer (ViT)
Kabir, Shahriar Mahmud
Bhuiyan, Mohammed Imamul Hassan
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 100
[8] Breast Ultrasound Image BI-RADS Classification Based on Vision Transformer
Wei, Yanbo
Ye, Junbo
Li, Xiaofeng
Zhao, Yuanyuan
Wang, Yanwei
INTERNATIONAL JOURNAL OF MULTIPHYSICS, 2024, 18 (02): 32 - 39
[9] Transforming Alzheimer's Disease Diagnosis: Implementing Vision Transformer (ViT) for MRI Images Classification
Kurniasari, Dian
Pratama, Muhammad Dwi
Junaidi, Akmal
Faisol, Ahmad
JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGY-MALAYSIA, 2025, 24 (01): 130 - 152
[10] Image Classification of Tree Species in Relatives Based on Dual-Branch Vision Transformer
Wang, Qi
Dong, Yanqi
Xu, Nuo
Xu, Fu
Mou, Chao
Chen, Feixiang
FORESTS, 2024, 15 (12)
